Importing Libraries¶

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt 
import calendar
from statsmodels.tsa.seasonal import seasonal_decompose
import statsmodels.api as sm
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.arima.model import ARIMA
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error
from prophet import Prophet
import matplotlib.patches as mpatches

Data Cleaning and Preparation¶

In [2]:
# Loading transactions dataset

transactions = pd.read_csv("transactions.csv", parse_dates=["history_date"],index_col='id')
transactions.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 381968 entries, 133692 to 430911
Data columns (total 6 columns):
 #   Column        Non-Null Count   Dtype         
---  ------        --------------   -----         
 0   history_date  381968 non-null  datetime64[ns]
 1   item_id       381968 non-null  int64         
 2   price         381968 non-null  float64       
 3   inventory     381967 non-null  float64       
 4   sales         381968 non-null  float64       
 5   category_id   381968 non-null  int64         
dtypes: datetime64[ns](1), float64(3), int64(2)
memory usage: 20.4 MB
In [3]:
transactions.head()
Out[3]:
history_date item_id price inventory sales category_id
id
133692 2014-08-26 394846296 12.81 183.0 461.16 1
134256 2014-08-27 394846296 12.81 183.0 576.45 1
134820 2014-08-28 394846296 12.81 183.0 397.11 1
135384 2014-08-29 394846296 12.81 183.0 397.11 1
135948 2014-08-30 394846296 12.81 183.0 602.07 1
In [4]:
# Loading promotion dataset

promotions = pd.read_csv("promos.csv", parse_dates=["promo_start_dt", "promo_end_dt"])
promotions.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 75 entries, 0 to 74
Data columns (total 4 columns):
 #   Column          Non-Null Count  Dtype         
---  ------          --------------  -----         
 0   item_id         75 non-null     int64         
 1   promo_type      75 non-null     object        
 2   promo_start_dt  75 non-null     datetime64[ns]
 3   promo_end_dt    75 non-null     datetime64[ns]
dtypes: datetime64[ns](2), int64(1), object(1)
memory usage: 2.5+ KB
In [5]:
promotions.head()
Out[5]:
item_id promo_type promo_start_dt promo_end_dt
0 394930651 PROMO_A 2014-09-04 2014-10-04
1 394914459 PROMO_C 2014-09-25 2014-10-25
2 512317760 PROMO_C 2014-02-04 2014-03-06
3 394851407 PROMO_A 2014-09-28 2014-10-28
4 394904090 PROMO_B 2014-08-29 2014-09-28
In [6]:
# Checking for NAN values in transaction dataset
transactions.isnull().sum()
Out[6]:
history_date    0
item_id         0
price           0
inventory       1
sales           0
category_id     0
dtype: int64
In [7]:
# Checking for NAN values in promotion dataset
promotions.isnull().sum()
Out[7]:
item_id           0
promo_type        0
promo_start_dt    0
promo_end_dt      0
dtype: int64
In [8]:
# Allocation of missing sales with 0
transactions["sales"].fillna(0, inplace=True)

# Correction of negative sales (returns)
transactions["returns"] = transactions["sales"].clip(upper=0)
transactions["sales"] = transactions["sales"].clip(lower=0) - transactions["returns"]

# Calculation of units sold (sales / price)
transactions["units_sold"] = transactions["sales"] / transactions["price"]
transactions["units_sold"] = transactions["units_sold"].apply(np.floor)

# For better visualisation
transactions = transactions[["history_date","item_id","category_id","inventory",'price','sales','units_sold']]
transactions.head()
Out[8]:
history_date item_id category_id inventory price sales units_sold
id
133692 2014-08-26 394846296 1 183.0 12.81 461.16 36.0
134256 2014-08-27 394846296 1 183.0 12.81 576.45 45.0
134820 2014-08-28 394846296 1 183.0 12.81 397.11 31.0
135384 2014-08-29 394846296 1 183.0 12.81 397.11 31.0
135948 2014-08-30 394846296 1 183.0 12.81 602.07 47.0
In [9]:
# Merging Transaction and Promotion Datasets

transactions_promo = transactions.merge(promotions, on="item_id", how="left")
transactions_promo.dropna(subset=['promo_type'], inplace=True)
transactions_promo['is_in_promotion'] =  (transactions_promo['history_date'] >= transactions_promo['promo_start_dt']) & (transactions_promo['history_date'] <= transactions_promo['promo_end_dt']) 
transactions_promo.head()
Out[9]:
history_date item_id category_id inventory price sales units_sold promo_type promo_start_dt promo_end_dt is_in_promotion
70 2014-01-01 394846541 3 161.0 80.39 5627.30 70.0 PROMO_B 2015-11-12 2015-12-12 False
71 2014-01-02 394846541 3 161.0 80.39 4582.23 57.0 PROMO_B 2015-11-12 2015-12-12 False
72 2014-01-03 394846541 3 161.0 80.39 5225.35 65.0 PROMO_B 2015-11-12 2015-12-12 False
73 2014-01-04 394846541 3 161.0 80.39 5948.86 74.0 PROMO_B 2015-11-12 2015-12-12 False
74 2014-01-05 394846541 3 161.0 80.39 5546.91 69.0 PROMO_B 2015-11-12 2015-12-12 False

Exploratory Data Analysis¶

Total Monthly Sales¶

In [10]:
# Best Month on Sales

transactions_promo['month'] = transactions_promo['history_date'].dt.month
monthly_sales = transactions_promo.groupby('month')['sales'].sum()
print(monthly_sales)
month
1     2.104428e+07
2     1.910062e+07
3     2.111897e+07
4     2.045971e+07
5     2.079916e+07
6     2.029153e+07
7     2.364601e+07
8     2.259255e+07
9     1.947083e+07
10    1.973734e+07
11    1.696516e+07
12    1.978192e+07
Name: sales, dtype: float64
In [11]:
# Plotting the monthly sales

plt.figure(figsize=(10, 6))
plt.bar(monthly_sales.index, monthly_sales.values, color='royalblue', alpha=0.8)
plt.xlabel('Month')
plt.ylabel('Total Sales')
plt.title('Monthly Sales')
plt.xticks(range(1, 13), calendar.month_name[1:13], rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.ylim(0, max(monthly_sales.values) * 1.1)

plt.tight_layout()
plt.show()

Best monthly sales were observed in the month of July, followed by the month August.

Product Category Distribution by Month¶

In [12]:
# Grouping the data by month and category, and calculating the count of each category

monthly_category_counts = transactions_promo.groupby(['month', 'category_id']).size().unstack(fill_value=0)

print(monthly_category_counts)
category_id    1    2    3    4    5    6
month                                    
1            893  558  562  738  496  866
2            791  501  508  649  453  792
3            868  558  558  713  496  868
4            803  540  548  690  460  840
5            775  541  589  713  465  868
6            750  536  568  693  459  829
7            775  558  543  713  473  830
8            802  567  571  684  535  874
9            755  600  570  630  510  845
10           752  593  588  627  524  821
11           693  381  354  435  375  588
12           614  322  316  436  322  529
In [13]:
# Creating a color palette for the categories
category_colors = plt.cm.tab20(np.linspace(0, 1, len(monthly_category_counts.columns)))

# Plotting the product category distribution for each month
monthly_category_counts.plot(kind='bar', stacked=True, color=category_colors, alpha=0.8,figsize=(10, 6))
plt.xlabel('Month')
plt.ylabel('Count')
plt.title('Product Category Distribution by Month')
plt.legend(loc='upper right',title='category')#bbox_to_anchor=(1.02, 1))
plt.xticks(range(12), calendar.month_name[1:13], rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()

Categories 1 and 6 products remain popular throughout the year, however the numbers for each category reduces considerably in the months of November and December.

Total Sales based on Month - Promotion v/s Non Promotion¶

In [14]:
# Grouping the data by category and calculating the total sales for each promoted category
promoted_category_sales = transactions_promo[transactions_promo['is_in_promotion'] == True].groupby('category_id')['sales'].sum()
non_promoted_category_sales = transactions_promo[transactions_promo['is_in_promotion'] == False].groupby('category_id')['sales'].sum()

# Sorting the products based on total sales in descending order
sorted_category_sales_in_promo = promoted_category_sales.sort_values(ascending=False)
sorted_category_sales_no_promo = non_promoted_category_sales.sort_values(ascending=False)
In [15]:
# Plotting the total sales for each promoted product

fig, axs = plt.subplots(1,2,figsize=(20,6))

axs[0].bar(sorted_category_sales_in_promo.index, sorted_category_sales_in_promo.values, color='royalblue', alpha=0.8)
axs[0].set_xlabel('Category')
axs[0].set_ylabel('Total Sales')
axs[0].set_title('Total Sales of Promoted Products - Category')
axs[0].grid(axis='y', linestyle='--', alpha=0.5)

axs[1].bar(sorted_category_sales_no_promo.index, sorted_category_sales_no_promo.values, color='red', alpha=0.8)
axs[1].set_xlabel('Category')
axs[1].set_ylabel('Total Sales')
axs[1].set_title('Total Sales of Products (out of promotion period) - Category')
axs[1].grid(axis='y', linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()
  • The most sales during the promotion period come from category 3 whereas the least sales come from category 4.
  • The most sales out of promotion period come from category 3 whereas the least sales come from category 4. So, this would not be a relevant way to see which categories were at most benefit due to the promotion.

Category Performance based on Promotion¶

In [16]:
# Calculating the ratio of non-promoted sales to promoted sales for each category to see which category was positively impacted the most

ratio = promoted_category_sales / non_promoted_category_sales
sorted_ratio = ratio.sort_values(ascending=False)
print(sorted_ratio)
category_id
2    0.084857
4    0.064733
5    0.060578
3    0.053107
6    0.050373
1    0.036373
Name: sales, dtype: float64
In [20]:
# Plotting the ratio of non-promoted sales to promoted sales for each category

plt.figure(figsize=(8, 4))
ratio.plot(kind='bar', color='royalblue', alpha=0.8)
plt.xlabel('Category')
plt.ylabel('Ratio (Promoted Sales / Non Promoted Sales)')
plt.title('Ratio of Promoted Sales to Non Promoted Sales by Category')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()

Here we are can clearly see that products from category 2 were at most benefit due to the promotion campaigns however, category 1 took the least benefit out of the other categories from promotion.

Top 10 Products with Highest Total Sales during Promotion Period¶

In [21]:
id_category = transactions_promo.groupby('item_id')['category_id'].last()

#Grouping the data by product and calculating the total sales for each promoted product
promoted_product_sales = transactions_promo[transactions_promo['is_in_promotion'] == True].groupby('item_id')['sales'].sum()

#Sorting the products based on total sales in descending order
sorted_product_sales = promoted_product_sales.sort_values(ascending=False)
top_10_products = sorted_product_sales.head(10)

#print(top_10_products)
In [22]:
product_names = top_10_products.index
product_names = [str(id_) for id_ in product_names]

sales = top_10_products.values

# Plotting the sales for the top 10 products
plt.figure(figsize=(12, 6))
plt.bar(product_names, sales, color='royalblue', alpha=0.8)
plt.xlabel('Product')
plt.ylabel('Total Sales')
plt.title('Top 10 Products with Highest Total Sales during Promotion')
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()

for i in product_names:
    print(f"Product: {i} --> category : {id_category.loc[int(i)]}")
Product: 512464651 --> category : 3
Product: 515775957 --> category : 6
Product: 512464646 --> category : 3
Product: 515775953 --> category : 3
Product: 512464615 --> category : 1
Product: 512320017 --> category : 5
Product: 515775902 --> category : 5
Product: 512320013 --> category : 6
Product: 512464613 --> category : 6
Product: 512319985 --> category : 1

These are the top 10 highest sales products during promotion period. However, it does not indicate that which products took the most benefit due to the promotion campaigns.

Top 10 Products with Highest Total Sales during Non Promotion Period¶

In [23]:
# Grouping the data by category and calculate the total sales for each non_promoted product
non_promoted_product_sales = transactions_promo[transactions_promo['is_in_promotion'] == False].groupby('item_id')['sales'].sum()

# Sorting the products based on total sales in descending order
sorted_product_sales_non = non_promoted_product_sales.sort_values(ascending=False)

# Sorting the products based on total sales in descending order
top_10_products = sorted_product_sales_non.head(10)
In [24]:
product_names = top_10_products.index
product_names = [str(id_) for id_ in product_names]

sales = top_10_products.values

# Plotting the sales for the top 10 products
plt.figure(figsize=(12, 6))
plt.bar(product_names, sales, color='royalblue', alpha=0.8)
plt.xlabel('Product')
plt.ylabel('Total Sales')
plt.title('Top 10 Products with Highest Total Sales (out of Promotion period)')
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()

for i in product_names:
    print(f"Product: {i} --> category : {id_category.loc[int(i)]}")
Product: 512464651 --> category : 3
Product: 512464615 --> category : 1
Product: 512464646 --> category : 3
Product: 515775957 --> category : 6
Product: 515775953 --> category : 3
Product: 512319985 --> category : 1
Product: 512320013 --> category : 6
Product: 512320017 --> category : 5
Product: 512464613 --> category : 6
Product: 515775902 --> category : 5
  • These are the top 10 highest sales products during out of promotion period, we can see that same product (512464651) was on the top during the promotion, is also on the top without promotion.
  • Categories 3,6 and 5 are the frequent ones by clients.

Top 10 Products based on Sales Turnover due to running Promotion Campaign¶

In [25]:
# Calculating the ratio of Non Promoted sales to Promoted sales for each product
ratio_product = promoted_product_sales / non_promoted_product_sales
sorted_ratio_product = ratio_product.sort_values(ascending=False)

top_10_promoted_performance = sorted_ratio_product.head(10)
In [26]:
# Plotting the ratio of Non Promoted sales to Promoted sales for each product
product_names = top_10_promoted_performance.index
product_names = [str(id_) for id_ in product_names]

sales = top_10_promoted_performance.values

# Plotting the sales for the top 10 products
plt.figure(figsize=(12, 6))
plt.bar(product_names, sales, color='royalblue', alpha=0.8)
plt.xlabel('Product')
plt.ylabel('Total Sales')
plt.title('Top 10 Products with Best Turn Over during Promotion')
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

for i in product_names:
    print(f"Product: {i} --> category : {id_category.loc[int(i)]}")
Product: 394851407 --> category : 2
Product: 394930651 --> category : 6
Product: 394914459 --> category : 6
Product: 394924633 --> category : 3
Product: 394904090 --> category : 2
Product: 394882669 --> category : 3
Product: 394865583 --> category : 3
Product: 394885779 --> category : 5
Product: 394890521 --> category : 5
Product: 512317737 --> category : 5

These are the 10 products that were positively affected due to the promotion campaign.

In [27]:
# Adding products's prices if there had been no promotion (prices in counterfactual model)
# I assume that the last price before the promotional period would have continued had a promotion not been run
# The column called 'counterfactual_price'

transactions_promo.insert(transactions_promo.columns.get_loc("price")+1,"counterfactual_price", pd.NaT)
for index, row in transactions_promo.iterrows():
    if row['is_in_promotion']:
        # Get the most recent non-promotional price for the item_id
        date = row["history_date"]
        last_non_promo_price = transactions_promo.loc[(transactions_promo['item_id'] == row['item_id']) & (~transactions_promo['is_in_promotion']) & (transactions_promo['history_date'] < date) , 'price'].iloc[-1]
        transactions_promo.at[index, 'counterfactual_price'] = last_non_promo_price
    else:
        # If not on promotion, use the actual price
        transactions_promo.at[index, 'counterfactual_price'] = row['price']
In [28]:
# Adding engineered features

transactions_promo.insert(transactions_promo.columns.get_loc("history_date")+1,"day", transactions_promo["history_date"].dt.day_of_week)
transactions_promo.insert(transactions_promo.columns.get_loc("history_date") + 2, "week", transactions_promo["history_date"].dt.isocalendar().week)
transactions_promo.insert(transactions_promo.columns.get_loc("history_date")+4,"year", transactions_promo["history_date"].dt.year)
transactions_promo["units_sold_counterfactual"] = transactions_promo["sales"] / transactions_promo["counterfactual_price"]
In [29]:
# Aggregating that dataset on weeks where there is a unique combination of date and product

transactions_promo_weekly = transactions_promo.groupby(["item_id", pd.Grouper(key="history_date", freq="W-MON")]).agg({
    "units_sold": "sum",
    "units_sold_counterfactual": "sum",
    "sales": "sum",
    "inventory": "sum",
    "promo_type": "last",
    "price": "last",
    "category_id": "last",
    "is_in_promotion":"sum",
    "counterfactual_price": "last"
}).reset_index()

transactions_promo_weekly.rename(columns={'is_in_promotion': 'days_per_week_of_promotion'}, inplace=True)
transactions_promo_weekly["is_in_promotion"] = transactions_promo_weekly["days_per_week_of_promotion"] > 0
In [30]:
# Adding engineered features in the weekly aggregated dataset
transactions_promo_weekly.insert(transactions_promo_weekly.columns.get_loc("history_date") + 1, "week", transactions_promo_weekly["history_date"].dt.isocalendar().week)
transactions_promo_weekly.insert(transactions_promo_weekly.columns.get_loc("history_date") + 2, "month", transactions_promo_weekly["history_date"].dt.month)
transactions_promo_weekly.insert(transactions_promo_weekly.columns.get_loc("history_date")+3,"year", transactions_promo_weekly["history_date"].dt.year)
In [31]:
transactions_promo_weekly.head()
Out[31]:
item_id history_date week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion
0 394846541 2014-01-06 2 1 2014 418.0 418.0 33603.02 900.0 PROMO_B 80.39 3 0 80.39 False
1 394846541 2014-01-13 3 1 2014 515.0 516.0 41481.24 630.0 PROMO_B 80.39 3 0 80.39 False
2 394846541 2014-01-20 4 1 2014 528.0 528.0 42445.92 511.0 PROMO_B 80.39 3 0 80.39 False
3 394846541 2014-01-27 5 1 2014 491.0 492.0 39551.88 511.0 PROMO_B 80.39 3 0 80.39 False
4 394846541 2014-02-03 6 2 2014 512.0 512.0 41159.68 511.0 PROMO_B 80.39 3 0 80.39 False
In [32]:
filtered_df_non_promotion = transactions_promo_weekly[transactions_promo_weekly["is_in_promotion"] == 0]
filtered_df_promotion = transactions_promo_weekly[transactions_promo_weekly["is_in_promotion"] == 1]
In [33]:
filtered_df_non_promotion_v1 = transactions_promo[transactions_promo["is_in_promotion"] == 0]
filtered_df_non_promotion_v1 = filtered_df_non_promotion_v1.set_index('history_date')

filtered_df_promotion_v1 = transactions_promo[transactions_promo["is_in_promotion"] == 1]
filtered_df_promotion_v1 = filtered_df_promotion_v1.set_index('history_date')

Units Sold - Distribution based on Year and Category ID - Promotion Period v/s Non Promotion Period¶

In [34]:
# Grouping the data by promo_type and days_per_week_of_promotion and calculate the sum of units_sold
grouped_df = filtered_df_promotion.groupby(['category_id', 'year'])['units_sold'].sum().unstack()

fig, ax = plt.subplots(figsize=(8, 5))

# Plotting the stacked bar chart
grouped_df.plot(kind='bar', stacked=True, ax=ax)

# Customizing the plot
plt.xlabel('Category ID')
plt.ylabel('Units Sold')
plt.title('Units Sold - Distribution based on Year and Category ID - Promotion Period')
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.xticks(rotation=0)
plt.tight_layout()

# Displaying the plot
plt.show()

During the promotion period most of the units sold are from category 3 in the year 2014 and the least number of units sold come from category 4 specially from 2015 to 2018.

In [35]:
# Grouping the data by promo_type and days_per_week_of_promotion and calculate the sum of units_sold
grouped_df = filtered_df_non_promotion.groupby(['category_id', 'year'])['units_sold'].sum().unstack()

fig, ax = plt.subplots(figsize=(8, 5))

# Plotting the stacked bar chart
grouped_df.plot(kind='bar', stacked=True, ax=ax)

# Customizing the plot
plt.xlabel('Category ID')
plt.ylabel('Units Sold')
plt.title('Units Sold - Distribution based on Year and Category ID - Non-Promotion Period')
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.xticks(rotation=0)
plt.tight_layout()

# Displaying the plot
plt.show()

During the promotion period most of the units sold are from category 3 in the year 2014 and the least number of units sold come from category 4 specially from 2015 to 2018. However, in comparison to the promotion period category 1 does better in the non promotion period.

In [36]:
filtered_df_promotion_v1.index = pd.to_datetime(filtered_df_promotion_v1.index)
filtered_df_promotion_v1['day'] = filtered_df_promotion_v1.index.weekday
filtered_df_promotion_v1['day_month'] = filtered_df_promotion_v1.index.day

filtered_df_non_promotion_v1.index = pd.to_datetime(filtered_df_non_promotion_v1.index)
filtered_df_non_promotion_v1['day'] = filtered_df_non_promotion_v1.index.weekday
filtered_df_non_promotion_v1['day_month'] = filtered_df_non_promotion_v1.index.day

Units Sold - Distribution based Day of Month and Category ID - Promotion Period v/s Non Promotion Period¶

In [37]:
# Grouping the data by promo_type and days_per_week_of_promotion and calculate the sum of units_sold
grouped_df = filtered_df_promotion_v1.groupby(['day_month', 'category_id'])['units_sold'].sum().unstack()

fig, ax = plt.subplots(figsize=(10, 6))

# Plotting the stacked bar chart
grouped_df.plot(kind='bar', stacked=True, ax=ax)

# Customizing the plot
plt.xlabel('Day of Month')
plt.ylabel('Units Sold')
plt.title('Units Sold - Distribution based on Day of Month and Category ID - Promotion Period')
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.xticks(rotation=45)
plt.tight_layout()

# Displaying the plot
plt.show()

During the promotion period most of the units sold are observed on the 25th of the month followed by 4th and 5th day of the month and the least number of units sold are observed on 31st followed by 29th and 30th of the month.

In [38]:
# Grouping the data by promo_type and days_per_week_of_promotion and calculate the sum of units_sold
grouped_df = filtered_df_non_promotion_v1.groupby(['day_month', 'category_id'])['units_sold'].sum().unstack()

fig, ax = plt.subplots(figsize=(10, 6))

# Plotting the stacked bar chart
grouped_df.plot(kind='bar', stacked=True, ax=ax)

# Customizing the plot
plt.xlabel('Day of Month')
plt.ylabel('Units Sold')
plt.title('Units Sold - Distribution based on Day of Month and Category ID - Non-Promotion Period')
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.xticks(rotation=45)
plt.tight_layout()

# Displaying the plot
plt.show()

During the non promotion period most of the units sold are observed on the 1st and 3rd of the month and the least number of units sold are observed on 31st followed by 29th and 30th of the month. However, while in non promotion periods most sales are observed in the first 3 days of the month in comparison to the promotion period.

Units Sold - Distribution based Day of Week and Category ID - Promotion Period v/s Non Promotion Period¶

In [39]:
# Grouping the data by promo_type and days_per_week_of_promotion and calculate the sum of units_sold
grouped_df = filtered_df_promotion_v1.groupby(['day', 'category_id'])['units_sold'].sum().unstack()

fig, ax = plt.subplots(figsize=(8, 6))

# Plotting the stacked bar chart
grouped_df.plot(kind='bar', stacked=True, ax=ax)

# Customizing the plot
plt.xlabel('Day of Week')
plt.ylabel('Units Sold')
plt.title('Units Sold - Distribution based on Day of Week and Category ID - Promotion Period')
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.xticks(rotation=0)
plt.tight_layout()

# Displaying the plot
plt.show()

During the promotion period most of the units sold are observed on the 1st day of the week and the least number of units sold are observed on 4th day of the week. Throughout the week category 3 is the most popular and category 4 remains the least popular category.

In [40]:
# Grouping the data by promo_type and days_per_week_of_promotion and calculate the sum of units_sold
grouped_df = filtered_df_non_promotion_v1.groupby(['day', 'category_id'])['units_sold'].sum().unstack()

fig, ax = plt.subplots(figsize=(8, 6))

# Plotting the stacked bar chart
grouped_df.plot(kind='bar', stacked=True, ax=ax)

# Customizing the plot
plt.xlabel('Day of Week')
plt.ylabel('Units Sold')
plt.title('Units Sold - Distribution based on Day of Week and Category ID - Non-Promotion Period')
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.xticks(rotation=0)
plt.tight_layout()

# Displaying the plot
plt.show()

During the non promotion period most of the units sold are observed on the 2nd day of the week and the least number of units sold are observed on 7th day of the week. However, in comparison to the promotion period there isn't much different in the units sold throughout different categories in the non promotion period.

Outliers¶

Outliers can have a significant impact on the correlation coefficient and may distort the relationship between variables, and it can make our study erroneous.

In [41]:
for id_ in transactions_promo['item_id'].unique():
        
    transactions_promo_week_one_item = transactions_promo_weekly[(transactions_promo_weekly['item_id']==id_) & (transactions_promo_weekly['is_in_promotion']==False)]
    transactions_promo_week_one_item.head()
    Q1=transactions_promo_week_one_item["units_sold"].quantile(0.25)
    Q3=transactions_promo_week_one_item["units_sold"].quantile(0.75)
    IQR=Q3-Q1
    
    lower_bound = Q1 - 1.5*IQR
    upper_bound = Q3 + 1.5*IQR
    
    
    us_winsorized = transactions_promo_week_one_item[['units_sold']].clip(lower_bound,upper_bound)
    
    for index,row in us_winsorized.iterrows():
        transactions_promo_weekly.loc[index,'winsorized_units_sold'] =  row['units_sold'] 

    transactions_promo_week_one_item = transactions_promo_weekly[(transactions_promo_weekly['item_id']==id_) & (transactions_promo_weekly['is_in_promotion']==True) ]
    
    Q1=transactions_promo_week_one_item["units_sold"].quantile(0.05)
    Q3=transactions_promo_week_one_item["units_sold"].quantile(0.95)
    IQR=Q3-Q1
    
    lower_bound = Q1 - 1.5*IQR
    upper_bound = Q3 + 1.5*IQR
    
    us_winsorized = transactions_promo_week_one_item[["units_sold"]].clip(lower_bound,upper_bound)

    for index,row in us_winsorized.iterrows():
        transactions_promo_weekly.loc[index,'winsorized_units_sold'] =  row['units_sold']
        
transactions_promo_weekly.head()
Out[41]:
item_id history_date week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion winsorized_units_sold
0 394846541 2014-01-06 2 1 2014 418.0 418.0 33603.02 900.0 PROMO_B 80.39 3 0 80.39 False 418.0
1 394846541 2014-01-13 3 1 2014 515.0 516.0 41481.24 630.0 PROMO_B 80.39 3 0 80.39 False 515.0
2 394846541 2014-01-20 4 1 2014 528.0 528.0 42445.92 511.0 PROMO_B 80.39 3 0 80.39 False 528.0
3 394846541 2014-01-27 5 1 2014 491.0 492.0 39551.88 511.0 PROMO_B 80.39 3 0 80.39 False 491.0
4 394846541 2014-02-03 6 2 2014 512.0 512.0 41159.68 511.0 PROMO_B 80.39 3 0 80.39 False 512.0
In [ ]:
# Plotting data with and without outliers (units _sold)

category = 3
list_ids_one_category = np.unique(transactions_promo_weekly[transactions_promo_weekly["category_id"]==category]['item_id'])


fig, axs = plt.subplots(len(list_ids_one_category), 4, figsize=(20, 4*len(list_ids_one_category)))
i=0
for id_ in list_ids_one_category:
    df_one_item_of_category = transactions_promo_weekly[(transactions_promo_weekly['item_id']==id_) & (transactions_promo_weekly["is_in_promotion"]==False)]
    
    sns.scatterplot(data = df_one_item_of_category,x='history_date',y='units_sold',ax=axs[i,0], color='red',label='outliers')
    sns.scatterplot(data = df_one_item_of_category,x='history_date',y='winsorized_units_sold',ax=axs[i,0])
    sns.boxplot(data = df_one_item_of_category[['units_sold','winsorized_units_sold']],ax=axs[i,1])
    
    axs[i,0].set_title(f'Item ID: {id_}')
    axs[i,0].set_xlabel('Date')
    axs[i,0].set_ylabel('Units sold (no promotion)')
    axs[i,1].set_ylabel('No promotion')
    axs[i,0].tick_params(axis='x', rotation=90)
    axs[i,1].set_title(f'Box plot for Item ID: {id_}')
    axs[i,1].tick_params(axis='x', rotation=35)
    df_one_item_of_category = transactions_promo_weekly[(transactions_promo_weekly['item_id']==id_) & (transactions_promo_weekly["is_in_promotion"]==True)]
    
    sns.scatterplot(data = df_one_item_of_category,x='history_date',y='units_sold',ax=axs[i,2], color='red',label="outliers")
    sns.scatterplot(data = df_one_item_of_category,x='history_date',y='winsorized_units_sold',ax=axs[i,2])
    sns.boxplot(data = df_one_item_of_category[['units_sold','winsorized_units_sold']],ax=axs[i,3])
    
    
    axs[i,2].set_title(f'Item ID: {id_}')
    axs[i,2].set_xlabel('Date')
    axs[i,3].set_ylabel('With promotion')
    axs[i,2].tick_params(axis='x', rotation=90)
    axs[i,3].set_title(f'Box plot for Item ID: {id_}')
    axs[i,3].tick_params(axis='x', rotation=35)
    i+=1

    
plt.subplots_adjust(hspace=0.8, wspace=1.0)
plt.show()
  • Exploring the graphs above (boxplots), we can see that our data set (units sold) contains many outliers, particularly for the period without promotion.

this values are removed by using the concept of the interquartile range (IQR).

  • The data is divided into promotional and non-promotional periods, This approach allows us to analyze and handle outliers within the context of promotional activities.
  • Outliers are dealt with separately for each product, as each item has its own characteristics, so there is no pattern between sales of different products.

Outlier values have been winsorized instead of being dropped. The plot above shows the distribution of outliers and how they are transfomed.

In [43]:
transactions_promo_weekly["units_sold_counterfactual"] = transactions_promo_weekly["winsorized_units_sold"]
transactions_promo_weekly.drop('winsorized_units_sold', axis=1, inplace=True)
transactions_promo_weekly.head()
Out[43]:
item_id history_date week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion
0 394846541 2014-01-06 2 1 2014 418.0 418.0 33603.02 900.0 PROMO_B 80.39 3 0 80.39 False
1 394846541 2014-01-13 3 1 2014 515.0 515.0 41481.24 630.0 PROMO_B 80.39 3 0 80.39 False
2 394846541 2014-01-20 4 1 2014 528.0 528.0 42445.92 511.0 PROMO_B 80.39 3 0 80.39 False
3 394846541 2014-01-27 5 1 2014 491.0 491.0 39551.88 511.0 PROMO_B 80.39 3 0 80.39 False
4 394846541 2014-02-03 6 2 2014 512.0 512.0 41159.68 511.0 PROMO_B 80.39 3 0 80.39 False

Data Scaling and Standardization¶

In [44]:
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Selecting the columns to be scaled
columns_to_scale = ['sales', 'inventory', 'price', 'counterfactual_price']

transactions_promo_weekly_scaled = transactions_promo_weekly

# Converting categorical variables
transactions_promo_weekly_scaled['promo_type'] = pd.factorize(transactions_promo_weekly_scaled['promo_type'])[0]
transactions_promo_weekly_scaled['is_in_promotion'] = pd.factorize(transactions_promo_weekly_scaled['is_in_promotion'])[0]
transactions_promo_weekly_scaled['days_per_week_of_promotion'] = pd.factorize(transactions_promo_weekly_scaled['days_per_week_of_promotion'])[0]

# Performing Min-Max scaling (Normalization)
minmax_scaler = MinMaxScaler()
transactions_promo_weekly_scaled[columns_to_scale] = minmax_scaler.fit_transform(transactions_promo_weekly_scaled[columns_to_scale])

# Performing Standardization
standard_scaler = StandardScaler()
transactions_promo_weekly_scaled[columns_to_scale] = standard_scaler.fit_transform(transactions_promo_weekly_scaled[columns_to_scale])
In [45]:
transactions_promo_weekly_scaled.head()
Out[45]:
item_id history_date week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion
0 394846541 2014-01-06 2 1 2014 418.0 418.0 -0.085062 -0.539646 0 -0.303045 3 0 -0.30601 0
1 394846541 2014-01-13 3 1 2014 515.0 515.0 0.060938 -0.751634 0 -0.303045 3 0 -0.30601 0
2 394846541 2014-01-20 4 1 2014 528.0 528.0 0.078816 -0.845066 0 -0.303045 3 0 -0.30601 0
3 394846541 2014-01-27 5 1 2014 491.0 491.0 0.025183 -0.845066 0 -0.303045 3 0 -0.30601 0
4 394846541 2014-02-03 6 2 2014 512.0 512.0 0.054979 -0.845066 0 -0.303045 3 0 -0.30601 0

Correlation Analysis¶

Correlation between units sold and other variables¶

In [46]:
# Observing the correlation between different variables.

fig, axs = plt.subplots(1,2,figsize=(20,8))

sns.heatmap(transactions_promo_weekly_scaled.corr()[["units_sold"]], cmap="YlGnBu", annot = True,ax= axs[0])
sns.heatmap(transactions_promo_weekly_scaled[transactions_promo_weekly_scaled['is_in_promotion']==True].corr()[["units_sold"]], cmap="YlGnBu", annot = True,ax= axs[1])
axs[0].set_title("Correlation of data's variables")
axs[1].set_title("Correlation of data's variables during promotion")

fig.subplots_adjust(hspace=4.5)
plt.show()

Having analyzed the correlation matrix above, and in our case in particular which is studying the impact of a promotion, we can estimate that:

  • The units sold strongly correlate (positive correlation) with sales.
  • Positive correlation is observed with, is_in_promotion and days_per_week_of_promotion variables.
  • While being promotion, units sold is positively correlated with sales, promo_type and lift.
In [47]:
transactions_promo_weekly_scaled.head()
Out[47]:
item_id history_date week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion
0 394846541 2014-01-06 2 1 2014 418.0 418.0 -0.085062 -0.539646 0 -0.303045 3 0 -0.30601 0
1 394846541 2014-01-13 3 1 2014 515.0 515.0 0.060938 -0.751634 0 -0.303045 3 0 -0.30601 0
2 394846541 2014-01-20 4 1 2014 528.0 528.0 0.078816 -0.845066 0 -0.303045 3 0 -0.30601 0
3 394846541 2014-01-27 5 1 2014 491.0 491.0 0.025183 -0.845066 0 -0.303045 3 0 -0.30601 0
4 394846541 2014-02-03 6 2 2014 512.0 512.0 0.054979 -0.845066 0 -0.303045 3 0 -0.30601 0

Time Series Analysis¶

In [48]:
transactions_promo_weekly_scaled['history_date'] = pd.to_datetime(transactions_promo_weekly_scaled['history_date'])

transactions_promo_weekly_scaled.set_index('history_date', inplace=True)
In [49]:
transactions_promo_weekly_scaled.head()
Out[49]:
item_id week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion
history_date
2014-01-06 394846541 2 1 2014 418.0 418.0 -0.085062 -0.539646 0 -0.303045 3 0 -0.30601 0
2014-01-13 394846541 3 1 2014 515.0 515.0 0.060938 -0.751634 0 -0.303045 3 0 -0.30601 0
2014-01-20 394846541 4 1 2014 528.0 528.0 0.078816 -0.845066 0 -0.303045 3 0 -0.30601 0
2014-01-27 394846541 5 1 2014 491.0 491.0 0.025183 -0.845066 0 -0.303045 3 0 -0.30601 0
2014-02-03 394846541 6 2 2014 512.0 512.0 0.054979 -0.845066 0 -0.303045 3 0 -0.30601 0
In [50]:
# List of Products

print(transactions_promo_weekly_scaled["item_id"].unique())
[394846541 394848615 394851407 394857109 394858170 394860161 394865583
 394866466 394873848 394882669 394885779 394885885 394890521 394893706
 394904090 394907934 394909995 394914459 394915909 394917065 394917897
 394924633 394930015 394930651 394931359 394940184 394941031 394941377
 394942170 394942631 394950597 394951176 395052168 395357341 395368886
 395375136 395382145 395384129 511584598 512317690 512317697 512317702
 512317726 512317737 512317760 512317763 512319115 512319119 512319130
 512319152 512319154 512319978 512319985 512320013 512320017 512464613
 512464615 512464625 512464633 512464642 512464646 512464651 512464658
 514002189 515375115 515702203 515775902 515775912 515775929 515775953
 515775957 516001717 516001998 516002000 516007566]
In [51]:
# Filtering products based on category and years

filtered_data_cat1_14 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 1) & (transactions_promo_weekly_scaled['year'] == 2014)]["item_id"].to_numpy()
filtered_data_cat1_15 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 1) & (transactions_promo_weekly_scaled['year'] == 2015)]["item_id"].to_numpy()
filtered_data_cat1_16 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 1) & (transactions_promo_weekly_scaled['year'] == 2016)]["item_id"].to_numpy()
filtered_data_cat1_17 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 1) & (transactions_promo_weekly_scaled['year'] == 2017)]["item_id"].to_numpy()

filtered_data_cat2_14 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 2) & (transactions_promo_weekly_scaled['year'] == 2014)]["item_id"].to_numpy()
filtered_data_cat2_15 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 2) & (transactions_promo_weekly_scaled['year'] == 2015)]["item_id"].to_numpy()
filtered_data_cat2_16 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 2) & (transactions_promo_weekly_scaled['year'] == 2016)]["item_id"].to_numpy()
filtered_data_cat2_17 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 2) & (transactions_promo_weekly_scaled['year'] == 2017)]["item_id"].to_numpy()

filtered_data_cat3_14 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 3) & (transactions_promo_weekly_scaled['year'] == 2014)]["item_id"].to_numpy()
filtered_data_cat3_15 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 3) & (transactions_promo_weekly_scaled['year'] == 2015)]["item_id"].to_numpy()
filtered_data_cat3_16 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 3) & (transactions_promo_weekly_scaled['year'] == 2016)]["item_id"].to_numpy()
filtered_data_cat3_17 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 3) & (transactions_promo_weekly_scaled['year'] == 2017)]["item_id"].to_numpy()

filtered_data_cat4_14 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 4) & (transactions_promo_weekly_scaled['year'] == 2014)]["item_id"].to_numpy()
filtered_data_cat4_15 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 4) & (transactions_promo_weekly_scaled['year'] == 2015)]["item_id"].to_numpy()
filtered_data_cat4_16 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 4) & (transactions_promo_weekly_scaled['year'] == 2016)]["item_id"].to_numpy()
filtered_data_cat4_17 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 4) & (transactions_promo_weekly_scaled['year'] == 2017)]["item_id"].to_numpy()

filtered_data_cat5_14 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 5) & (transactions_promo_weekly_scaled['year'] == 2014)]["item_id"].to_numpy()
filtered_data_cat5_15 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 5) & (transactions_promo_weekly_scaled['year'] == 2015)]["item_id"].to_numpy()
filtered_data_cat5_16 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 5) & (transactions_promo_weekly_scaled['year'] == 2016)]["item_id"].to_numpy()
filtered_data_cat5_17 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 5) & (transactions_promo_weekly_scaled['year'] == 2017)]["item_id"].to_numpy()

filtered_data_cat6_14 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 6) & (transactions_promo_weekly_scaled['year'] == 2014)]["item_id"].to_numpy()
filtered_data_cat6_15 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 6) & (transactions_promo_weekly_scaled['year'] == 2015)]["item_id"].to_numpy()
filtered_data_cat6_16 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 6) & (transactions_promo_weekly_scaled['year'] == 2016)]["item_id"].to_numpy()
filtered_data_cat6_17 = transactions_promo_weekly_scaled[(transactions_promo_weekly_scaled['category_id'] == 6) & (transactions_promo_weekly_scaled['year'] == 2017)]["item_id"].to_numpy()
In [52]:
# Filtering out products that are found in most of the years

common_values_cat1 = np.intersect1d(np.intersect1d(np.intersect1d(filtered_data_cat1_14, filtered_data_cat1_15), filtered_data_cat1_16), filtered_data_cat1_17)
common_values_cat2 = np.intersect1d(np.intersect1d(np.intersect1d(filtered_data_cat2_14, filtered_data_cat2_15), filtered_data_cat2_16), filtered_data_cat2_17)
common_values_cat3 = np.intersect1d(np.intersect1d(np.intersect1d(filtered_data_cat3_14, filtered_data_cat3_15), filtered_data_cat3_16), filtered_data_cat3_17)
common_values_cat4 = np.intersect1d(np.intersect1d(np.intersect1d(filtered_data_cat4_14, filtered_data_cat4_15), filtered_data_cat4_16), filtered_data_cat4_17)
common_values_cat5 = np.intersect1d(np.intersect1d(np.intersect1d(filtered_data_cat5_14, filtered_data_cat5_15), filtered_data_cat5_16), filtered_data_cat5_17)
common_values_cat6 = np.intersect1d(np.intersect1d(np.intersect1d(filtered_data_cat6_14, filtered_data_cat6_15), filtered_data_cat6_16), filtered_data_cat6_17)
In [53]:
# This list has products from each category

selected_products_allcat = [512319985, 512317697, 512319154, 512464642, 516002000, 395368886]
In [54]:
# Filtering data for all 6 products

filtered_df = transactions_promo_weekly_scaled[transactions_promo_weekly_scaled['item_id'].isin(selected_products_allcat)]
In [55]:
grouped_data = filtered_df.groupby('item_id')

for item_id, group in grouped_data:
    
    fig, axs = plt.subplots(nrows=1, ncols=4, figsize=(20, 5))
    
    # Plotting the time series
    axs[0].plot(group.index, group['sales'])
    axs[0].set_title(f'Time Series - Item ID: {item_id}')
    axs[0].set_xlabel('Date')
    axs[0].set_ylabel('Sales')
    axs[0].tick_params(axis='x', rotation=45)
    
    decomposition = seasonal_decompose(group['sales'], model='additive', extrapolate_trend='freq', period=5)

    seasonal_component = decomposition.seasonal
    seasonal_component.index.freq = pd.infer_freq(group.index)

    # Plotting the seasonal component
    axs[1].plot(group.index, seasonal_component)
    axs[1].set_title(f'Seasonal Component - Item ID: {item_id}')
    axs[1].set_xlabel('Date')
    axs[1].set_ylabel('Seasonal Component')
    axs[1].tick_params(axis='x', rotation=45)
    
    acf = sm.graphics.tsa.plot_acf(group['sales'], lags=30, ax=axs[2])
    acf.axes[0].set_xlabel('Lags')
    
    pacf = sm.graphics.tsa.plot_pacf(group['sales'], lags=15, method='ywm', ax=axs[3])
    pacf.axes[0].set_xlabel('Lags')
    
    plt.tight_layout()
    plt.show()

We can make the following observations:

  • There are several autocorrelations that are significantly non-zero. Therefore, the time series is non-random.
  • High degree of autocorrelation between adjacent (lag = 1) and near-adjacent observations in PACF plot
  • From both the ACF and PACF plot, we can see a strong correlation with the adjacent observation (lag = 1) and also at a lag of 12, which is the value of T.

Data Splitting and Model Fitting¶

In [56]:
# Choosing features required for the model
filtered_df = transactions_promo_weekly[["item_id", "category_id", "units_sold_counterfactual", "counterfactual_price", "week", "month", "year", "is_in_promotion", "days_per_week_of_promotion", "promo_type", "inventory"]]
product_list = filtered_df["item_id"].unique()
In [57]:
filtered_df_non_promotion = transactions_promo_weekly[transactions_promo_weekly["is_in_promotion"] == 0]
filtered_df_promotion = transactions_promo_weekly[transactions_promo_weekly["is_in_promotion"] == 1]

SARIMA Model¶

In [58]:
# SARIMA model, train-test split and model evaluation

def SARIMA_train_test(df):
    
    df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
    train_data, test_data = train_test_split(df, test_size=0.2, shuffle=False)

    # Defining SARIMA parameters
    order = (1, 0, 0)  # (p, d, q)
    seasonal_order = (1, 0, 0, 52)  # (P, D, Q, S) - adjust the seasonal period 'S' accordingly

    # Setting the frequency of the time series data
    train_data = train_data.asfreq('W-MON')
    test_data = test_data.asfreq('W-MON')

    # Preparing the training data
    train_series = train_data['units_sold_counterfactual']

    # Creating and fitting the SARIMA model
    model = SARIMAX(train_series, order=order, seasonal_order=seasonal_order)
    fitted_model = model.fit(maxiter=1000)

    coefficients = fitted_model.params

    
    # Forecasting values for the test data
    y_pred = pd.Series(np.array(fitted_model.forecast(len(test_data))))   
    y_true =  pd.Series(np.array(test_data['units_sold_counterfactual']))
    
    # Computing accuracy measures
    mse = np.mean((y_true - y_pred)**2)  # Mean Squared Error
    rmse = np.sqrt(mse)  # Root Mean Squared Error
    mae = np.mean(np.abs(y_true - y_pred))  # Mean Absolute Error
    mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100  # Mean Absolute Percentage Error


    evaluation_metrics = {
                          'Mean Absolute Error': mae,
                          'Mean Squared Error': mse,
                          'Root Mean Squared Error': rmse,
                          'MAPE': mape
                         }
    
    y_pred_test = pd.DataFrame()
    y_pred_test["history_date"] = test_data.index
    y_pred_test["SARIMA Predictions"] = y_pred
    
    return y_pred_test, evaluation_metrics, fitted_model
In [59]:
filtered_df_non_promotion.head()
Out[59]:
item_id week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion
history_date
2014-01-06 394846541 2 1 2014 418.0 418.0 -0.085062 -0.539646 0 -0.303045 3 0 -0.30601 0
2014-01-13 394846541 3 1 2014 515.0 515.0 0.060938 -0.751634 0 -0.303045 3 0 -0.30601 0
2014-01-20 394846541 4 1 2014 528.0 528.0 0.078816 -0.845066 0 -0.303045 3 0 -0.30601 0
2014-01-27 394846541 5 1 2014 491.0 491.0 0.025183 -0.845066 0 -0.303045 3 0 -0.30601 0
2014-02-03 394846541 6 2 2014 512.0 512.0 0.054979 -0.845066 0 -0.303045 3 0 -0.30601 0
In [60]:
# Model Fitting SARIMA

models_SARIMA = []

for index, item in enumerate(product_list):
    var_name_prediction = "prediction_" + str(item) + "_SARIMA"
    var_name_em = "evaluation_metrics_" + str(item) + "_SARIMA"
    var_name_model = "fitted_model_" + str(item) + "_SARIMA"
    var_name = filtered_df_non_promotion[filtered_df_non_promotion['item_id'] == item]
    globals()[var_name_prediction], globals()[var_name_em], SARIMA_model = SARIMA_train_test(var_name)
    models_SARIMA.append(SARIMA_model)
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.78271D+00    |proj g|=  5.25536D-02

At iterate    5    f=  3.77297D+00    |proj g|=  2.13717D-02

At iterate   10    f=  3.77083D+00    |proj g|=  4.84240D-03

At iterate   15    f=  3.76721D+00    |proj g|=  1.15851D-02

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     19     26      1     0     0   3.709D-06   3.767D+00
  F =   3.7668535170769686     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.29759D+01    |proj g|=  9.27661D+01

At iterate    5    f=  4.69370D+00    |proj g|=  1.01476D+00

At iterate   10    f=  4.03643D+00    |proj g|=  3.43543D-02

At iterate   15    f=  4.00344D+00    |proj g|=  1.95374D-03

At iterate   20    f=  3.99974D+00    |proj g|=  1.86872D-02

At iterate   25    f=  3.99624D+00    |proj g|=  1.33914D-02

At iterate   30    f=  3.99489D+00    |proj g|=  8.02551D-03

At iterate   35    f=  3.99431D+00    |proj g|=  2.84183D-03

At iterate   40    f=  3.99395D+00    |proj g|=  1.28867D-03

At iterate   45    f=  3.99371D+00    |proj g|=  2.65894D-03

At iterate   50    f=  3.99357D+00    |proj g|=  1.06420D-03

At iterate   55    f=  3.99346D+00    |proj g|=  4.40130D-04

At iterate   60    f=  3.99339D+00    |proj g|=  6.51211D-04

At iterate   65    f=  3.99336D+00    |proj g|=  1.06652D-03

At iterate   70    f=  3.99334D+00    |proj g|=  2.05566D-03

At iterate   75    f=  3.99333D+00    |proj g|=  2.12214D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     75    104      1     0     0   2.122D-05   3.993D+00
  F =   3.9933296997362593     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  7.79922D+00    |proj g|=  2.65782D+00

At iterate    5    f=  6.47485D+00    |proj g|=  5.40514D-03

At iterate   10    f=  6.47477D+00    |proj g|=  1.34448D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
At iterate   15    f=  6.47456D+00    |proj g|=  4.89504D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     19     21      1     0     0   6.004D-06   6.474D+00
  F =   6.4744779455791805     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.36613D+00    |proj g|=  5.43507D-02
 This problem is unconstrained.
At iterate    5    f=  4.34120D+00    |proj g|=  6.85811D-04

At iterate   10    f=  4.34046D+00    |proj g|=  2.50267D-02

At iterate   15    f=  4.33520D+00    |proj g|=  2.52870D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     19     27      1     0     0   6.497D-06   4.335D+00
  F =   4.3350558012247431     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.67913D+00    |proj g|=  2.66697D-02

At iterate    5    f=  3.66242D+00    |proj g|=  2.24267D-02

At iterate   10    f=  3.64624D+00    |proj g|=  1.68066D-02

At iterate   15    f=  3.64403D+00    |proj g|=  5.21916D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     18     24      1     0     0   7.335D-07   3.644D+00
  F =   3.6440227459787442     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.07215D+00    |proj g|=  2.97004D-02

At iterate    5    f=  5.06547D+00    |proj g|=  1.54649D-02

At iterate   10    f=  5.04543D+00    |proj g|=  1.31515D-02

At iterate   15    f=  5.04517D+00    |proj g|=  3.54863D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     15     19      1     0     0   3.549D-06   5.045D+00
  F =   5.0451682434333955     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.12977D+01    |proj g|=  6.95219D+01
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  7.23542D+00    |proj g|=  1.19070D+00

At iterate   10    f=  6.01242D+00    |proj g|=  6.52911D-02

At iterate   15    f=  5.98757D+00    |proj g|=  1.54700D-04

At iterate   20    f=  5.98755D+00    |proj g|=  4.42595D-03

At iterate   25    f=  5.98685D+00    |proj g|=  3.83234D-03

At iterate   30    f=  5.98647D+00    |proj g|=  4.91370D-03

At iterate   35    f=  5.98614D+00    |proj g|=  4.88962D-03

At iterate   40    f=  5.98596D+00    |proj g|=  3.12476D-03

At iterate   45    f=  5.98585D+00    |proj g|=  5.70032D-04

At iterate   50    f=  5.98578D+00    |proj g|=  1.29461D-03

At iterate   55    f=  5.98575D+00    |proj g|=  8.57963D-04

At iterate   60    f=  5.98571D+00    |proj g|=  1.01799D-03

At iterate   65    f=  5.98567D+00    |proj g|=  3.44140D-04

At iterate   70    f=  5.98565D+00    |proj g|=  3.74126D-04

At iterate   75    f=  5.98564D+00    |proj g|=  5.37696D-04

At iterate   80    f=  5.98563D+00    |proj g|=  3.09838D-04

At iterate   85    f=  5.98562D+00    |proj g|=  3.06333D-04

At iterate   90    f=  5.98562D+00    |proj g|=  2.97323D-04

At iterate   95    f=  5.98561D+00    |proj g|=  1.58135D-04

At iterate  100    f=  5.98560D+00    |proj g|=  1.74765D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3    102    137      1     0     0   7.284D-06   5.986D+00
  F =   5.9856041219265581     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.70401D+00    |proj g|=  2.48362D-02

At iterate    5    f=  3.69036D+00    |proj g|=  2.03869D-02

At iterate   10    f=  3.68159D+00    |proj g|=  2.79868D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     19      1     0     0   1.597D-06   3.682D+00
  F =   3.6815759389320299     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.07917D+00    |proj g|=  2.31772D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  4.07557D+00    |proj g|=  3.99062D-03

At iterate   10    f=  4.07106D+00    |proj g|=  1.33655D-02

At iterate   15    f=  4.06955D+00    |proj g|=  2.68548D-03

At iterate   20    f=  4.06891D+00    |proj g|=  3.85835D-03

At iterate   25    f=  4.06850D+00    |proj g|=  3.73235D-03

At iterate   30    f=  4.06829D+00    |proj g|=  2.51413D-03

At iterate   35    f=  4.06818D+00    |proj g|=  1.65259D-03

At iterate   40    f=  4.06809D+00    |proj g|=  5.78944D-04

At iterate   45    f=  4.06805D+00    |proj g|=  1.51754D-03

At iterate   50    f=  4.06802D+00    |proj g|=  6.64594D-04

At iterate   55    f=  4.06799D+00    |proj g|=  1.04972D-03

At iterate   60    f=  4.06797D+00    |proj g|=  1.89929D-03

At iterate   65    f=  4.06796D+00    |proj g|=  4.11686D-04
  ys=-2.963E-06  -gs= 1.165E-05 BFGS update SKIPPED
 Warning:  more than 10 function and gradient
   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     68    112      1     1     0   1.026D-03   4.068D+00
  F =   4.0679430399556331     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.69773D+02    |proj g|=  6.23723D+02

At iterate    5    f=  2.46995D+01    |proj g|=  1.24961D+01

At iterate   10    f=  5.95887D+00    |proj g|=  1.95242D-03

At iterate   15    f=  5.95877D+00    |proj g|=  1.00700D-04

At iterate   20    f=  5.95876D+00    |proj g|=  1.20276D-03

At iterate   25    f=  5.95854D+00    |proj g|=  3.09412D-03

At iterate   30    f=  5.95801D+00    |proj g|=  7.64529D-03

At iterate   35    f=  5.95658D+00    |proj g|=  1.38176D-02

At iterate   40    f=  5.95491D+00    |proj g|=  3.21875D-03

At iterate   45    f=  5.95403D+00    |proj g|=  2.07953D-03

At iterate   50    f=  5.95375D+00    |proj g|=  2.44052D-03

At iterate   55    f=  5.95363D+00    |proj g|=  1.89856D-04

At iterate   60    f=  5.95361D+00    |proj g|=  1.02848D-04

At iterate   65    f=  5.95360D+00    |proj g|=  2.84786D-04

At iterate   70    f=  5.95360D+00    |proj g|=  1.56111D-04

At iterate   75    f=  5.95360D+00    |proj g|=  2.19573D-04

At iterate   80    f=  5.95359D+00    |proj g|=  2.15312D-04

At iterate   85    f=  5.95359D+00    |proj g|=  5.77161D-05

At iterate   90    f=  5.95359D+00    |proj g|=  9.01650D-05

At iterate   95    f=  5.95359D+00    |proj g|=  8.54970D-05

At iterate  100    f=  5.95359D+00    |proj g|=  8.27398D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3    101    136      1     0     0   4.059D-05   5.954D+00
  F =   5.9535919607194820     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.65618D+00    |proj g|=  1.27570D-01

At iterate    5    f=  2.57242D+00    |proj g|=  5.58061D-03

At iterate   10    f=  2.55229D+00    |proj g|=  3.67154D-04
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   15    f=  2.54824D+00    |proj g|=  1.69639D-02

At iterate   20    f=  2.54621D+00    |proj g|=  8.16768D-03

At iterate   25    f=  2.54548D+00    |proj g|=  3.68316D-03

At iterate   30    f=  2.54500D+00    |proj g|=  1.71545D-03

At iterate   35    f=  2.54483D+00    |proj g|=  7.10575D-04

At iterate   40    f=  2.54468D+00    |proj g|=  5.98444D-04

At iterate   45    f=  2.54460D+00    |proj g|=  4.04889D-04

At iterate   50    f=  2.54454D+00    |proj g|=  5.84564D-04

At iterate   55    f=  2.54451D+00    |proj g|=  6.21960D-04

At iterate   60    f=  2.54449D+00    |proj g|=  1.94430D-04

At iterate   65    f=  2.54447D+00    |proj g|=  4.43704D-04

At iterate   70    f=  2.54446D+00    |proj g|=  4.89279D-05

At iterate   75    f=  2.54446D+00    |proj g|=  4.24565D-04

At iterate   80    f=  2.54445D+00    |proj g|=  4.14307D-04

At iterate   85    f=  2.54444D+00    |proj g|=  3.76545D-04

At iterate   90    f=  2.54444D+00    |proj g|=  4.89133D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     92    139      1     0     0   4.209D-04   2.544D+00
  F =   2.5444404779300855     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  6.78443D+01    |proj g|=  1.14585D+02

At iterate    5    f=  3.96317D+00    |proj g|=  1.45041D+00

At iterate   10    f=  2.91126D+00    |proj g|=  8.93594D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   15    f=  2.80325D+00    |proj g|=  1.30503D-02

At iterate   20    f=  2.80064D+00    |proj g|=  1.81898D-02

At iterate   25    f=  2.79852D+00    |proj g|=  9.68660D-04

At iterate   30    f=  2.79776D+00    |proj g|=  4.98899D-03

At iterate   35    f=  2.79727D+00    |proj g|=  2.61200D-03

At iterate   40    f=  2.79703D+00    |proj g|=  1.46278D-03

At iterate   45    f=  2.79683D+00    |proj g|=  2.21193D-03

At iterate   50    f=  2.79672D+00    |proj g|=  1.31235D-03

At iterate   55    f=  2.79666D+00    |proj g|=  3.06447D-03

At iterate   60    f=  2.79661D+00    |proj g|=  4.16839D-03

At iterate   65    f=  2.79657D+00    |proj g|=  1.23235D-03

At iterate   70    f=  2.79654D+00    |proj g|=  2.00222D-03

At iterate   75    f=  2.79652D+00    |proj g|=  2.20126D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     78    117      1     0     0   1.017D-03   2.797D+00
  F =   2.7965127108395844     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  7.04155D+02    |proj g|=  9.41349D+02

At iterate    5    f=  3.14990D+01    |proj g|=  1.86921D+01

At iterate   10    f=  5.55181D+00    |proj g|=  6.81275D-02

At iterate   15    f=  5.38936D+00    |proj g|=  5.44086D-02

At iterate   20    f=  5.38632D+00    |proj g|=  2.07245D-04

At iterate   25    f=  5.38626D+00    |proj g|=  6.78624D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   30    f=  5.38585D+00    |proj g|=  7.64542D-03

At iterate   35    f=  5.38550D+00    |proj g|=  1.31880D-03

At iterate   40    f=  5.38526D+00    |proj g|=  4.76120D-04

At iterate   45    f=  5.38519D+00    |proj g|=  1.41514D-04

At iterate   50    f=  5.38515D+00    |proj g|=  5.37921D-04

At iterate   55    f=  5.38514D+00    |proj g|=  6.12950D-04

At iterate   60    f=  5.38513D+00    |proj g|=  4.76880D-04

At iterate   65    f=  5.38512D+00    |proj g|=  9.31264D-04

At iterate   70    f=  5.38512D+00    |proj g|=  9.20807D-05

At iterate   75    f=  5.38512D+00    |proj g|=  9.12393D-05

At iterate   80    f=  5.38512D+00    |proj g|=  1.32306D-04

At iterate   85    f=  5.38512D+00    |proj g|=  7.19379D-05

At iterate   90    f=  5.38512D+00    |proj g|=  3.84457D-05

At iterate   95    f=  5.38512D+00    |proj g|=  1.04250D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     97    118      1     0     0   7.978D-06   5.385D+00
  F =   5.3851154454994310     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.35196D+00    |proj g|=  3.98907D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  3.34974D+00    |proj g|=  9.40384D-03

At iterate   10    f=  3.34860D+00    |proj g|=  1.70275D-03

At iterate   15    f=  3.34464D+00    |proj g|=  1.53073D-02

At iterate   20    f=  3.34304D+00    |proj g|=  1.32738D-03

At iterate   25    f=  3.34227D+00    |proj g|=  1.67828D-03

At iterate   30    f=  3.34199D+00    |proj g|=  3.07570D-03

At iterate   35    f=  3.34177D+00    |proj g|=  2.37910D-03

At iterate   40    f=  3.34163D+00    |proj g|=  2.84265D-03

At iterate   45    f=  3.34156D+00    |proj g|=  1.82287D-03

At iterate   50    f=  3.34148D+00    |proj g|=  1.98074D-03

At iterate   55    f=  3.34144D+00    |proj g|=  1.76699D-04

At iterate   60    f=  3.34141D+00    |proj g|=  1.34366D-03

At iterate   65    f=  3.34138D+00    |proj g|=  1.18010D-03

At iterate   70    f=  3.34136D+00    |proj g|=  4.73141D-05

At iterate   75    f=  3.34135D+00    |proj g|=  4.46454D-05

At iterate   80    f=  3.34135D+00    |proj g|=  9.38869D-04

At iterate   85    f=  3.34135D+00    |proj g|=  4.66333D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     89    150      1     0     0   1.097D-03   3.341D+00
  F =   3.3413490530213310     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  9.49538D+01    |proj g|=  1.39059D+02

At iterate    5    f=  7.75083D+00    |proj g|=  2.34541D+00

At iterate   10    f=  5.22328D+00    |proj g|=  1.32741D-01

At iterate   15    f=  5.10617D+00    |proj g|=  4.69147D-03

At iterate   20    f=  5.10224D+00    |proj g|=  1.52406D-03
 Warning:  more than 10 function and gradient
   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   25    f=  5.10053D+00    |proj g|=  1.10280D-02

At iterate   30    f=  5.09924D+00    |proj g|=  5.46524D-03

At iterate   35    f=  5.09856D+00    |proj g|=  5.21172D-03

At iterate   40    f=  5.09821D+00    |proj g|=  6.03800D-03

At iterate   45    f=  5.09785D+00    |proj g|=  2.26799D-03

At iterate   50    f=  5.09763D+00    |proj g|=  7.61686D-04

At iterate   55    f=  5.09752D+00    |proj g|=  2.52464D-03

At iterate   60    f=  5.09741D+00    |proj g|=  4.79408D-04

At iterate   65    f=  5.09735D+00    |proj g|=  1.32530D-03

At iterate   70    f=  5.09731D+00    |proj g|=  3.75368D-04

At iterate   75    f=  5.09728D+00    |proj g|=  1.07853D-03

At iterate   80    f=  5.09726D+00    |proj g|=  1.06590D-03

At iterate   85    f=  5.09724D+00    |proj g|=  9.98007D-04

At iterate   90    f=  5.09723D+00    |proj g|=  6.88604D-04

At iterate   95    f=  5.09722D+00    |proj g|=  1.41500D-03

At iterate  100    f=  5.09721D+00    |proj g|=  5.67384D-04

At iterate  105    f=  5.09720D+00    |proj g|=  4.26172D-04

At iterate  110    f=  5.09720D+00    |proj g|=  7.87933D-04

At iterate  115    f=  5.09720D+00    |proj g|=  8.78711D-05

At iterate  120    f=  5.09720D+00    |proj g|=  2.30035D-04

At iterate  125    f=  5.09719D+00    |proj g|=  4.60423D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3    126    175      1     0     0   4.905D-04   5.097D+00
  F =   5.0971929549160873     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.68368D+00    |proj g|=  3.02526D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  3.68019D+00    |proj g|=  1.18977D-02

At iterate   10    f=  3.67738D+00    |proj g|=  1.01040D-02

At iterate   15    f=  3.67662D+00    |proj g|=  5.56954D-03

At iterate   20    f=  3.67623D+00    |proj g|=  2.54302D-03

At iterate   25    f=  3.67601D+00    |proj g|=  2.23551D-04

At iterate   30    f=  3.67588D+00    |proj g|=  2.42024D-03

At iterate   35    f=  3.67578D+00    |proj g|=  1.69412D-04

At iterate   40    f=  3.67572D+00    |proj g|=  1.36104D-03

At iterate   45    f=  3.67568D+00    |proj g|=  4.53004D-04

At iterate   50    f=  3.67564D+00    |proj g|=  8.59266D-05

At iterate   55    f=  3.67562D+00    |proj g|=  3.85395D-04

At iterate   60    f=  3.67561D+00    |proj g|=  1.67891D-04

At iterate   65    f=  3.67559D+00    |proj g|=  1.37073D-03

At iterate   70    f=  3.67558D+00    |proj g|=  5.42545D-04

At iterate   75    f=  3.67557D+00    |proj g|=  6.78784D-04

At iterate   80    f=  3.67557D+00    |proj g|=  1.50154D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     82    114      1     0     0   6.061D-05   3.676D+00
  F =   3.6755685657861639     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  6.20715D+01    |proj g|=  1.09836D+02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  4.76702D+00    |proj g|=  1.21113D+00

At iterate   10    f=  3.96528D+00    |proj g|=  3.82099D-02

At iterate   15    f=  3.92289D+00    |proj g|=  8.89267D-04

At iterate   20    f=  3.92227D+00    |proj g|=  1.49822D-03

At iterate   25    f=  3.91906D+00    |proj g|=  6.43091D-03

At iterate   30    f=  3.91809D+00    |proj g|=  2.42065D-03

At iterate   35    f=  3.91785D+00    |proj g|=  1.44937D-03

At iterate   40    f=  3.91772D+00    |proj g|=  1.03652D-03

At iterate   45    f=  3.91766D+00    |proj g|=  6.41039D-04

At iterate   50    f=  3.91762D+00    |proj g|=  9.73042D-04

At iterate   55    f=  3.91760D+00    |proj g|=  4.54332D-04

At iterate   60    f=  3.91758D+00    |proj g|=  4.88746D-04

At iterate   65    f=  3.91757D+00    |proj g|=  4.35072D-04

At iterate   70    f=  3.91756D+00    |proj g|=  2.83401D-04

At iterate   75    f=  3.91755D+00    |proj g|=  4.83978D-04

At iterate   80    f=  3.91755D+00    |proj g|=  4.85656D-04

At iterate   85    f=  3.91754D+00    |proj g|=  5.66959D-04

At iterate   90    f=  3.91754D+00    |proj g|=  4.78181D-04

At iterate   95    f=  3.91754D+00    |proj g|=  7.80856D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     96    149      1     0     0   7.811D-05   3.918D+00
  F =   3.9175355525457807     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  7.16858D+00    |proj g|=  1.97807D+00

At iterate    5    f=  4.26548D+00    |proj g|=  2.31402D-01

At iterate   10    f=  4.11725D+00    |proj g|=  5.41972D-03

At iterate   15    f=  4.11698D+00    |proj g|=  1.08995D-02

At iterate   20    f=  4.11661D+00    |proj g|=  2.54761D-03

At iterate   25    f=  4.11638D+00    |proj g|=  4.04825D-03

At iterate   30    f=  4.11621D+00    |proj g|=  1.56037D-03

At iterate   35    f=  4.11607D+00    |proj g|=  3.13383D-03

At iterate   40    f=  4.11600D+00    |proj g|=  1.14736D-03

At iterate   45    f=  4.11596D+00    |proj g|=  5.41345D-04

At iterate   50    f=  4.11593D+00    |proj g|=  2.23095D-03

At iterate   55    f=  4.11590D+00    |proj g|=  4.08937D-04

At iterate   60    f=  4.11588D+00    |proj g|=  7.53978D-04

At iterate   65    f=  4.11586D+00    |proj g|=  2.10471D-03

At iterate   70    f=  4.11585D+00    |proj g|=  1.29737D-03

At iterate   75    f=  4.11584D+00    |proj g|=  4.93776D-04

At iterate   80    f=  4.11583D+00    |proj g|=  1.75496D-03

At iterate   85    f=  4.11582D+00    |proj g|=  3.04625D-03

At iterate   90    f=  4.11582D+00    |proj g|=  1.90427D-03

At iterate   95    f=  4.11582D+00    |proj g|=  6.99148D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     95    136      1     0     0   6.991D-04   4.116D+00
  F =   4.1158160005995370     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  8.10660D+01    |proj g|=  1.45001D+02

At iterate    5    f=  5.39290D+00    |proj g|=  1.59996D+00

At iterate   10    f=  4.33477D+00    |proj g|=  4.53752D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   15    f=  4.28201D+00    |proj g|=  5.17062D-03

At iterate   20    f=  4.28053D+00    |proj g|=  2.83765D-03

At iterate   25    f=  4.27574D+00    |proj g|=  1.25576D-02

At iterate   30    f=  4.27440D+00    |proj g|=  2.91812D-03

At iterate   35    f=  4.27386D+00    |proj g|=  3.22018D-03

At iterate   40    f=  4.27358D+00    |proj g|=  1.99545D-03

At iterate   45    f=  4.27334D+00    |proj g|=  6.52239D-04

At iterate   50    f=  4.27321D+00    |proj g|=  2.02047D-03

At iterate   55    f=  4.27311D+00    |proj g|=  8.25125D-04

At iterate   60    f=  4.27303D+00    |proj g|=  2.55706D-04

At iterate   65    f=  4.27299D+00    |proj g|=  9.43770D-04

At iterate   70    f=  4.27293D+00    |proj g|=  1.68363D-04

At iterate   75    f=  4.27292D+00    |proj g|=  5.28090D-04

At iterate   80    f=  4.27291D+00    |proj g|=  7.72064D-04

At iterate   85    f=  4.27290D+00    |proj g|=  9.87360D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     89    128      1     0     0   1.488D-04   4.273D+00
  F =   4.2728980769687723     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.57669D+00    |proj g|=  4.50345D-02

At iterate    5    f=  4.57190D+00    |proj g|=  4.03412D-03

At iterate   10    f=  4.57129D+00    |proj g|=  1.45078D-04

At iterate   15    f=  4.57126D+00    |proj g|=  9.64005D-04

At iterate   20    f=  4.57120D+00    |proj g|=  2.34531D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     21     23      1     0     0   3.241D-06   4.571D+00
  F =   4.5711958996641728     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.42512D+00    |proj g|=  6.02833D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  3.41852D+00    |proj g|=  2.19986D-02

At iterate   10    f=  3.41482D+00    |proj g|=  1.34683D-02

At iterate   15    f=  3.41279D+00    |proj g|=  8.58542D-03

At iterate   20    f=  3.41195D+00    |proj g|=  1.29147D-03

At iterate   25    f=  3.41156D+00    |proj g|=  2.60265D-03

At iterate   30    f=  3.41129D+00    |proj g|=  2.63472D-03

At iterate   35    f=  3.41111D+00    |proj g|=  2.84149D-04

At iterate   40    f=  3.41102D+00    |proj g|=  2.50923D-03

At iterate   45    f=  3.41095D+00    |proj g|=  3.05210D-03

At iterate   50    f=  3.41090D+00    |proj g|=  4.79466D-04

At iterate   55    f=  3.41087D+00    |proj g|=  2.82942D-03

At iterate   60    f=  3.41085D+00    |proj g|=  3.01388D-03

At iterate   65    f=  3.41083D+00    |proj g|=  7.32545D-04

At iterate   70    f=  3.41082D+00    |proj g|=  1.63336D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     71    103      1     0     0   1.635D-04   3.411D+00
  F =   3.4108239535894018     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.01383D+01    |proj g|=  4.19263D+01

At iterate    5    f=  3.56472D+00    |proj g|=  9.10857D-01

At iterate   10    f=  2.64378D+00    |proj g|=  5.39571D-02

At iterate   15    f=  2.57051D+00    |proj g|=  8.08498D-03

At iterate   20    f=  2.56324D+00    |proj g|=  5.80993D-05

At iterate   25    f=  2.56315D+00    |proj g|=  6.32907D-03

At iterate   30    f=  2.56247D+00    |proj g|=  5.60720D-03

At iterate   35    f=  2.56206D+00    |proj g|=  1.42389D-04

At iterate   40    f=  2.56179D+00    |proj g|=  2.62899D-03

At iterate   45    f=  2.56164D+00    |proj g|=  2.19265D-03

At iterate   50    f=  2.56154D+00    |proj g|=  1.23088D-03

At iterate   55    f=  2.56148D+00    |proj g|=  9.52167D-04

At iterate   60    f=  2.56143D+00    |proj g|=  6.03186D-04

At iterate   65    f=  2.56140D+00    |proj g|=  4.77158D-04

At iterate   70    f=  2.56138D+00    |proj g|=  2.05991D-04

At iterate   75    f=  2.56137D+00    |proj g|=  2.83276D-04

At iterate   80    f=  2.56135D+00    |proj g|=  2.03325D-04

At iterate   85    f=  2.56134D+00    |proj g|=  1.07234D-04

At iterate   90    f=  2.56133D+00    |proj g|=  7.48320D-05

At iterate   95    f=  2.56133D+00    |proj g|=  2.77312D-04

At iterate  100    f=  2.56133D+00    |proj g|=  3.60252D-04

At iterate  105    f=  2.56132D+00    |proj g|=  1.29056D-04

At iterate  110    f=  2.56132D+00    |proj g|=  1.22008D-04

At iterate  115    f=  2.56132D+00    |proj g|=  8.25879D-05

At iterate  120    f=  2.56132D+00    |proj g|=  1.50566D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3    121    169      1     0     0   1.259D-05   2.561D+00
  F =   2.5613196418365254     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.83318D+00    |proj g|=  2.97996D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  3.82789D+00    |proj g|=  1.37461D-03

At iterate   10    f=  3.82270D+00    |proj g|=  7.44208D-03

At iterate   15    f=  3.82165D+00    |proj g|=  8.39603D-03

At iterate   20    f=  3.82120D+00    |proj g|=  4.44346D-03

At iterate   25    f=  3.82091D+00    |proj g|=  2.97417D-03

At iterate   30    f=  3.82063D+00    |proj g|=  3.61730D-04

At iterate   35    f=  3.82058D+00    |proj g|=  6.49648D-04

At iterate   40    f=  3.82050D+00    |proj g|=  7.08846D-04

At iterate   45    f=  3.82046D+00    |proj g|=  1.62739D-03

At iterate   50    f=  3.82042D+00    |proj g|=  4.28572D-04

At iterate   55    f=  3.82040D+00    |proj g|=  5.42452D-04

At iterate   60    f=  3.82038D+00    |proj g|=  8.83560D-04

At iterate   65    f=  3.82037D+00    |proj g|=  1.15516D-03
  ys=-3.289E-06  -gs= 4.945E-06 BFGS update SKIPPED

At iterate   70    f=  3.82035D+00    |proj g|=  1.02333D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     70    110      1     1     0   1.023D-03   3.820D+00
  F =   3.8203534122533074     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
 Warning:  more than 10 function and gradient
   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.32686D+00    |proj g|=  1.22744D+00

At iterate    5    f=  2.49366D+00    |proj g|=  8.36666D-03

At iterate   10    f=  2.48713D+00    |proj g|=  1.76078D-02

At iterate   15    f=  2.43809D+00    |proj g|=  9.20028D-03

At iterate   20    f=  2.43054D+00    |proj g|=  7.15372D-03

At iterate   25    f=  2.42876D+00    |proj g|=  2.00883D-03

At iterate   30    f=  2.42873D+00    |proj g|=  4.80388D-04

At iterate   35    f=  2.42866D+00    |proj g|=  1.12787D-03

At iterate   40    f=  2.42851D+00    |proj g|=  5.96876D-04

At iterate   45    f=  2.42839D+00    |proj g|=  4.86290D-03

At iterate   50    f=  2.42760D+00    |proj g|=  2.33467D-03

At iterate   55    f=  2.42726D+00    |proj g|=  1.75035D-03

At iterate   60    f=  2.42695D+00    |proj g|=  8.66457D-04

At iterate   65    f=  2.42668D+00    |proj g|=  2.35432D-03

At iterate   70    f=  2.42661D+00    |proj g|=  2.00997D-03

At iterate   75    f=  2.42658D+00    |proj g|=  6.57079D-04

At iterate   80    f=  2.42655D+00    |proj g|=  6.08744D-04

At iterate   85    f=  2.42654D+00    |proj g|=  5.32340D-04

At iterate   90    f=  2.42653D+00    |proj g|=  5.57817D-04

At iterate   95    f=  2.42652D+00    |proj g|=  9.94275D-05

At iterate  100    f=  2.42652D+00    |proj g|=  2.21765D-04

At iterate  105    f=  2.42651D+00    |proj g|=  1.53955D-04

At iterate  110    f=  2.42651D+00    |proj g|=  1.37104D-04

At iterate  115    f=  2.42650D+00    |proj g|=  1.33144D-04

At iterate  120    f=  2.42650D+00    |proj g|=  8.71432D-05

At iterate  125    f=  2.42650D+00    |proj g|=  1.15227D-04

At iterate  130    f=  2.42650D+00    |proj g|=  9.02935D-05

At iterate  135    f=  2.42650D+00    |proj g|=  2.31996D-05

At iterate  140    f=  2.42650D+00    |proj g|=  3.77955D-05

At iterate  145    f=  2.42649D+00    |proj g|=  9.70526D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3    145    198      1     0     0   9.705D-06   2.426D+00
  F =   2.4264948977275083     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.15534D+01    |proj g|=  7.01247D+01

At iterate    5    f=  4.66287D+00    |proj g|=  7.88482D-01

At iterate   10    f=  4.13454D+00    |proj g|=  2.56913D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   15    f=  4.11075D+00    |proj g|=  9.84672D-04

At iterate   20    f=  4.10369D+00    |proj g|=  9.08759D-03

At iterate   25    f=  4.10230D+00    |proj g|=  1.70417D-02

At iterate   30    f=  4.10056D+00    |proj g|=  1.18233D-02

At iterate   35    f=  4.09968D+00    |proj g|=  2.72805D-03

At iterate   40    f=  4.09920D+00    |proj g|=  1.64318D-03

At iterate   45    f=  4.09895D+00    |proj g|=  1.33829D-03

At iterate   50    f=  4.09879D+00    |proj g|=  1.55293D-03

At iterate   55    f=  4.09872D+00    |proj g|=  4.77325D-04

At iterate   60    f=  4.09865D+00    |proj g|=  8.10765D-04

At iterate   65    f=  4.09861D+00    |proj g|=  1.96979D-03

At iterate   70    f=  4.09858D+00    |proj g|=  3.05369D-04

At iterate   75    f=  4.09858D+00    |proj g|=  1.92422D-04

At iterate   80    f=  4.09856D+00    |proj g|=  5.42177D-04

At iterate   85    f=  4.09855D+00    |proj g|=  1.15716D-04

At iterate   90    f=  4.09855D+00    |proj g|=  7.79073D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     91    123      1     0     0   8.694D-05   4.099D+00
  F =   4.0985530716501133     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.16976D+02    |proj g|=  2.11130D+02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  5.79627D+00    |proj g|=  2.37894D+00

At iterate   10    f=  4.22858D+00    |proj g|=  1.04436D-01

At iterate   15    f=  4.09658D+00    |proj g|=  8.83652D-03

At iterate   20    f=  4.08949D+00    |proj g|=  3.13781D-04

At iterate   25    f=  4.08628D+00    |proj g|=  3.33285D-03

At iterate   30    f=  4.08397D+00    |proj g|=  1.62663D-02

At iterate   35    f=  4.08141D+00    |proj g|=  6.29673D-03

At iterate   40    f=  4.08072D+00    |proj g|=  3.43523D-03

At iterate   45    f=  4.08014D+00    |proj g|=  2.61900D-03

At iterate   50    f=  4.07982D+00    |proj g|=  2.28952D-03

At iterate   55    f=  4.07961D+00    |proj g|=  1.21863D-03

At iterate   60    f=  4.07946D+00    |proj g|=  1.64838D-04

At iterate   65    f=  4.07939D+00    |proj g|=  1.06179D-03

At iterate   70    f=  4.07934D+00    |proj g|=  1.72140D-03

At iterate   75    f=  4.07929D+00    |proj g|=  7.90391D-04

At iterate   80    f=  4.07928D+00    |proj g|=  7.10503D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     81    125      1     0     0   7.271D-05   4.079D+00
  F =   4.0792834726563214     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.89435D+00    |proj g|=  6.39360D-03

At iterate    5    f=  3.89374D+00    |proj g|=  1.91600D-04

At iterate   10    f=  3.89345D+00    |proj g|=  6.54781D-03

At iterate   15    f=  3.89158D+00    |proj g|=  3.08037D-03

At iterate   20    f=  3.88522D+00    |proj g|=  1.48247D-02

At iterate   25    f=  3.88203D+00    |proj g|=  1.29497D-02

At iterate   30    f=  3.88042D+00    |proj g|=  1.01437D-03

At iterate   35    f=  3.87995D+00    |proj g|=  2.89005D-03

At iterate   40    f=  3.87955D+00    |proj g|=  4.72081D-04

At iterate   45    f=  3.87936D+00    |proj g|=  1.93106D-03

At iterate   50    f=  3.87922D+00    |proj g|=  2.48155D-03

At iterate   55    f=  3.87912D+00    |proj g|=  2.02441D-03

At iterate   60    f=  3.87905D+00    |proj g|=  1.14492D-03

At iterate   65    f=  3.87900D+00    |proj g|=  3.12532D-03
  ys=-8.485E-07  -gs= 1.307E-05 BFGS update SKIPPED
 Warning:  more than 10 function and gradient
   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
At iterate   70    f=  3.87896D+00    |proj g|=  1.97216D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     70    122      1     1     0   1.972D-03   3.879D+00
  F =   3.8789609830274583     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.38113D+00    |proj g|=  8.30584D-02

At iterate    5    f=  4.37244D+00    |proj g|=  7.56896D-03

At iterate   10    f=  4.37231D+00    |proj g|=  1.05251D-04

At iterate   15    f=  4.37226D+00    |proj g|=  1.33855D-03

At iterate   20    f=  4.37225D+00    |proj g|=  1.93614D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     20     23      1     0     0   1.936D-06   4.372D+00
  F =   4.3722504823540262     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  7.49612D+01    |proj g|=  1.33141D+02

At iterate    5    f=  5.34291D+00    |proj g|=  1.47421D+00

At iterate   10    f=  4.37104D+00    |proj g|=  4.35772D-02

At iterate   15    f=  4.32408D+00    |proj g|=  1.80230D-03

At iterate   20    f=  4.32288D+00    |proj g|=  4.70417D-04

At iterate   25    f=  4.31610D+00    |proj g|=  1.80619D-02

At iterate   30    f=  4.31473D+00    |proj g|=  7.80008D-03

At iterate   35    f=  4.31377D+00    |proj g|=  1.50816D-03

At iterate   40    f=  4.31330D+00    |proj g|=  1.00312D-03

At iterate   45    f=  4.31299D+00    |proj g|=  9.84102D-04

At iterate   50    f=  4.31282D+00    |proj g|=  1.81930D-03

At iterate   55    f=  4.31267D+00    |proj g|=  1.19754D-03

At iterate   60    f=  4.31260D+00    |proj g|=  1.23698D-03

At iterate   65    f=  4.31255D+00    |proj g|=  9.05835D-04

At iterate   70    f=  4.31252D+00    |proj g|=  5.31496D-04

At iterate   75    f=  4.31252D+00    |proj g|=  4.41287D-05

At iterate   80    f=  4.31248D+00    |proj g|=  2.85425D-04

At iterate   85    f=  4.31248D+00    |proj g|=  1.19570D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     87    138      1     0     0   1.473D-03   4.312D+00
  F =   4.3124693721475928     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.21619D+00    |proj g|=  2.30242D-02

At iterate    5    f=  4.21124D+00    |proj g|=  1.18340D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   10    f=  4.20192D+00    |proj g|=  1.43027D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     13     16      1     0     0   4.513D-06   4.202D+00
  F =   4.2019095223167700     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.12053D+00    |proj g|=  2.61566D-02

At iterate    5    f=  5.11506D+00    |proj g|=  1.31990D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   10    f=  5.10457D+00    |proj g|=  6.22997D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     17      1     0     0   1.134D-05   5.105D+00
  F =   5.1045696148239452     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  8.73297D+01    |proj g|=  1.56774D+02

At iterate    5    f=  5.45274D+00    |proj g|=  1.73421D+00
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   10    f=  4.30512D+00    |proj g|=  5.29725D-02

At iterate   15    f=  4.23749D+00    |proj g|=  2.72965D-03

At iterate   20    f=  4.23491D+00    |proj g|=  6.87179D-04

At iterate   25    f=  4.23043D+00    |proj g|=  1.76867D-02

At iterate   30    f=  4.22814D+00    |proj g|=  1.09491D-02

At iterate   35    f=  4.22670D+00    |proj g|=  6.36254D-03

At iterate   40    f=  4.22603D+00    |proj g|=  5.66897D-03

At iterate   45    f=  4.22567D+00    |proj g|=  1.15772D-03

At iterate   50    f=  4.22536D+00    |proj g|=  1.96170D-03

At iterate   55    f=  4.22514D+00    |proj g|=  5.26946D-04

At iterate   60    f=  4.22500D+00    |proj g|=  7.86514D-05

At iterate   65    f=  4.22492D+00    |proj g|=  4.65191D-04

At iterate   70    f=  4.22487D+00    |proj g|=  7.24695D-04

At iterate   75    f=  4.22483D+00    |proj g|=  4.43895D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     76    109      1     0     0   4.434D-04   4.225D+00
  F =   4.2248303260153666     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.43521D+00    |proj g|=  2.26182D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  4.43163D+00    |proj g|=  9.17578D-03

At iterate   10    f=  4.42037D+00    |proj g|=  1.77180D-02

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     18      1     0     0   7.943D-06   4.420D+00
  F =   4.4197781103324125     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.92217D+00    |proj g|=  3.20251D-03
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  3.91922D+00    |proj g|=  3.68872D-03

At iterate   10    f=  3.91334D+00    |proj g|=  1.09055D-02

At iterate   15    f=  3.91149D+00    |proj g|=  1.17354D-02

At iterate   20    f=  3.91045D+00    |proj g|=  2.24950D-03

At iterate   25    f=  3.91000D+00    |proj g|=  3.06184D-03

At iterate   30    f=  3.90970D+00    |proj g|=  6.08297D-04

At iterate   35    f=  3.90955D+00    |proj g|=  1.32218D-03

At iterate   40    f=  3.90946D+00    |proj g|=  2.25080D-03

At iterate   45    f=  3.90937D+00    |proj g|=  4.54703D-04

At iterate   50    f=  3.90933D+00    |proj g|=  1.17167D-03

At iterate   55    f=  3.90930D+00    |proj g|=  1.45576D-03

At iterate   60    f=  3.90928D+00    |proj g|=  6.08738D-05

At iterate   65    f=  3.90926D+00    |proj g|=  6.43833D-04

At iterate   70    f=  3.90924D+00    |proj g|=  1.15238D-03

At iterate   75    f=  3.90924D+00    |proj g|=  2.14283D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     77    120      1     0     0   5.760D-05   3.909D+00
  F =   3.9092399282222541     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.42310D+00    |proj g|=  5.97693D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
At iterate    5    f=  4.41950D+00    |proj g|=  5.82638D-03

At iterate   10    f=  4.41903D+00    |proj g|=  7.99124D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     16      1     0     0   7.410D-06   4.419D+00
  F =   4.4189211014052505     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.33473D+00    |proj g|=  5.77322D-02

At iterate    5    f=  4.56843D+00    |proj g|=  2.25295D-03

At iterate   10    f=  4.56818D+00    |proj g|=  4.62995D-04

At iterate   15    f=  4.56794D+00    |proj g|=  1.22229D-02

At iterate   20    f=  4.56762D+00    |proj g|=  1.21479D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     21     31      1     0     0   1.460D-06   4.568D+00
  F =   4.5676231793925659     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.80401D+00    |proj g|=  1.58686D-02

At iterate    5    f=  3.79877D+00    |proj g|=  1.35691D-02

At iterate   10    f=  3.79715D+00    |proj g|=  7.00547D-04

At iterate   15    f=  3.79698D+00    |proj g|=  1.07338D-02

At iterate   20    f=  3.79432D+00    |proj g|=  2.15032D-03

At iterate   25    f=  3.78917D+00    |proj g|=  1.31732D-02

At iterate   30    f=  3.78695D+00    |proj g|=  2.16791D-03

At iterate   35    f=  3.78623D+00    |proj g|=  5.49078D-03

At iterate   40    f=  3.78583D+00    |proj g|=  3.79400D-03

At iterate   45    f=  3.78563D+00    |proj g|=  1.31763D-03

At iterate   50    f=  3.78544D+00    |proj g|=  1.06187D-03

At iterate   55    f=  3.78534D+00    |proj g|=  1.16417D-03

At iterate   60    f=  3.78526D+00    |proj g|=  1.38019D-04

At iterate   65    f=  3.78522D+00    |proj g|=  7.81101D-04

At iterate   70    f=  3.78518D+00    |proj g|=  7.68870D-04

At iterate   75    f=  3.78516D+00    |proj g|=  1.51874D-03

At iterate   80    f=  3.78514D+00    |proj g|=  1.12795D-03

At iterate   85    f=  3.78512D+00    |proj g|=  3.34436D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     86    129      1     0     0   3.108D-04   3.785D+00
  F =   3.7851207747049926     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.61810D+00    |proj g|=  2.79318D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  4.60997D+00    |proj g|=  2.47069D-02

At iterate   10    f=  4.59585D+00    |proj g|=  9.70543D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     13     16      1     0     0   1.085D-05   4.596D+00
  F =   4.5958370507365967     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  8.92822D+01    |proj g|=  1.65794D+02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate    5    f=  5.72599D+00    |proj g|=  1.68948D+00

At iterate   10    f=  4.63235D+00    |proj g|=  4.74004D-02

At iterate   15    f=  4.58294D+00    |proj g|=  1.02547D-03

At iterate   20    f=  4.58233D+00    |proj g|=  2.44622D-03

At iterate   25    f=  4.58047D+00    |proj g|=  3.36567D-02

At iterate   30    f=  4.57681D+00    |proj g|=  5.31614D-03

At iterate   35    f=  4.57600D+00    |proj g|=  8.03789D-03

At iterate   40    f=  4.57531D+00    |proj g|=  2.14845D-03

At iterate   45    f=  4.57489D+00    |proj g|=  8.31588D-04

At iterate   50    f=  4.57474D+00    |proj g|=  5.87048D-04

At iterate   55    f=  4.57463D+00    |proj g|=  1.78693D-03

At iterate   60    f=  4.57457D+00    |proj g|=  1.39905D-03

At iterate   65    f=  4.57452D+00    |proj g|=  1.37663D-03

At iterate   70    f=  4.57448D+00    |proj g|=  2.26044D-03

At iterate   75    f=  4.57445D+00    |proj g|=  1.51673D-03

At iterate   80    f=  4.57444D+00    |proj g|=  2.45611D-04

At iterate   85    f=  4.57443D+00    |proj g|=  5.41075D-04

At iterate   90    f=  4.57442D+00    |proj g|=  1.96097D-04

At iterate   95    f=  4.57442D+00    |proj g|=  3.09745D-04

At iterate  100    f=  4.57441D+00    |proj g|=  1.58590D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3    100    136      1     0     0   1.586D-04   4.574D+00
  F =   4.5744141274024264     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.07592D+00    |proj g|=  1.12399D-01

At iterate    5    f=  4.06103D+00    |proj g|=  2.24425D-02

At iterate   10    f=  4.05952D+00    |proj g|=  2.78333D-03

At iterate   15    f=  4.05780D+00    |proj g|=  2.17070D-02

At iterate   20    f=  4.05673D+00    |proj g|=  1.44509D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     21     24      1     0     0   5.516D-08   4.057D+00
  F =   4.0567270179395134     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.83755D+00    |proj g|=  2.57634D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
  ys=-2.223E+00  -gs= 3.478E-01 BFGS update SKIPPED

At iterate    5    f=  4.88115D+00    |proj g|=  1.14936D-02

At iterate   10    f=  4.86788D+00    |proj g|=  1.35720D-02

At iterate   15    f=  4.86769D+00    |proj g|=  2.38373D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     17     27      2     1     0   3.481D-06   4.868D+00
  F =   4.8676868576921510     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.44348D+00    |proj g|=  5.00683D-02

At iterate    5    f=  4.58024D+00    |proj g|=  6.73263D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     20      1     0     0   6.837D-07   4.580D+00
  F =   4.5801348733547940     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.25385D+00    |proj g|=  7.72175D-02

At iterate    5    f=  4.24910D+00    |proj g|=  1.26631D-02

At iterate   10    f=  4.24851D+00    |proj g|=  1.59772D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     17      1     0     0   7.499D-07   4.248D+00
  F =   4.2484681712553218     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.93908D+01    |proj g|=  6.99486D+01

At iterate    5    f=  4.33165D+00    |proj g|=  1.14556D+00

At iterate   10    f=  3.38312D+00    |proj g|=  8.00700D-02

At iterate   15    f=  3.31117D+00    |proj g|=  1.87777D-02

At iterate   20    f=  3.30637D+00    |proj g|=  2.37390D-04

At iterate   25    f=  3.30472D+00    |proj g|=  1.34126D-03

At iterate   30    f=  3.30373D+00    |proj g|=  7.50021D-03

At iterate   35    f=  3.30314D+00    |proj g|=  4.05573D-03

At iterate   40    f=  3.30282D+00    |proj g|=  1.66254D-03

At iterate   45    f=  3.30267D+00    |proj g|=  2.54909D-04

At iterate   50    f=  3.30258D+00    |proj g|=  8.94552D-04

At iterate   55    f=  3.30251D+00    |proj g|=  7.15687D-04

At iterate   60    f=  3.30246D+00    |proj g|=  5.61830D-04

At iterate   65    f=  3.30243D+00    |proj g|=  3.35424D-04

At iterate   70    f=  3.30241D+00    |proj g|=  2.79124D-04

At iterate   75    f=  3.30239D+00    |proj g|=  7.84230D-05

At iterate   80    f=  3.30238D+00    |proj g|=  1.55425D-04

At iterate   85    f=  3.30237D+00    |proj g|=  3.31677D-04

At iterate   90    f=  3.30237D+00    |proj g|=  1.89853D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     90    129      1     0     0   1.899D-05   3.302D+00
  F =   3.3023700069972652     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.69231D+01    |proj g|=  2.54543D+01

At iterate    5    f=  3.53303D+00    |proj g|=  2.81178D-01
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
At iterate   10    f=  3.36908D+00    |proj g|=  1.44719D-02

At iterate   15    f=  3.36003D+00    |proj g|=  5.29998D-04

At iterate   20    f=  3.35960D+00    |proj g|=  6.54885D-03

At iterate   25    f=  3.35915D+00    |proj g|=  2.44939D-04

At iterate   30    f=  3.35913D+00    |proj g|=  3.30477D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     33     36      1     0     0   4.502D-06   3.359D+00
  F =   3.3591334227400331     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.97710D+00    |proj g|=  1.35645D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
At iterate    5    f=  3.97468D+00    |proj g|=  9.74497D-04

At iterate   10    f=  3.97466D+00    |proj g|=  1.14123D-04

At iterate   15    f=  3.97465D+00    |proj g|=  1.12493D-03

At iterate   20    f=  3.97462D+00    |proj g|=  2.65138D-04

At iterate   25    f=  3.97462D+00    |proj g|=  2.63150D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     25     30      1     0     0   2.631D-06   3.975D+00
  F =   3.9746173344298215     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.59933D+00    |proj g|=  6.83893D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
At iterate    5    f=  3.78483D+00    |proj g|=  6.51690D-03

At iterate   10    f=  3.78460D+00    |proj g|=  3.22363D-03

At iterate   15    f=  3.78454D+00    |proj g|=  5.28377D-07

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     15     22      1     0     0   5.284D-07   3.785D+00
  F =   3.7845428358416418     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.11629D+00    |proj g|=  7.41764D-02

At iterate    5    f=  4.26166D+00    |proj g|=  1.80350D-02

At iterate   10    f=  4.25959D+00    |proj g|=  2.82948D-03

At iterate   15    f=  4.25879D+00    |proj g|=  1.54660D-02

At iterate   20    f=  4.25837D+00    |proj g|=  1.84359D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     20     28      1     0     0   1.844D-06   4.258D+00
  F =   4.2583733638940835     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.56799D+00    |proj g|=  1.38791D-01

At iterate    5    f=  4.55351D+00    |proj g|=  1.27641D-02

At iterate   10    f=  4.55304D+00    |proj g|=  6.75655D-03

At iterate   15    f=  4.55205D+00    |proj g|=  8.32630D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     18     20      1     0     0   8.573D-06   4.552D+00
  F =   4.5520477664401167     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.96375D+00    |proj g|=  1.04419D-01
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
At iterate    5    f=  4.37448D+00    |proj g|=  1.33401D-03

At iterate   10    f=  4.37417D+00    |proj g|=  1.73543D-03

At iterate   15    f=  4.37393D+00    |proj g|=  1.03399D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     15     20      1     0     0   1.034D-05   4.374D+00
  F =   4.3739272799446169     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.92166D+00    |proj g|=  6.97307D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
At iterate    5    f=  4.95873D+00    |proj g|=  2.12918D-03

At iterate   10    f=  4.95872D+00    |proj g|=  1.45149D-03

At iterate   15    f=  4.95830D+00    |proj g|=  1.94417D-02

At iterate   20    f=  4.95411D+00    |proj g|=  4.50976D-02

At iterate   25    f=  4.95148D+00    |proj g|=  6.19416D-07

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     25     43      1     0     0   6.194D-07   4.951D+00
  F =   4.9514828596343179     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.87896D+00    |proj g|=  5.08558D-03

At iterate    5    f=  4.87858D+00    |proj g|=  6.99130D-04

At iterate   10    f=  4.87629D+00    |proj g|=  2.78950D-02

At iterate   15    f=  4.87509D+00    |proj g|=  3.46835D-03

At iterate   20    f=  4.86670D+00    |proj g|=  2.69496D-02

At iterate   25    f=  4.86146D+00    |proj g|=  1.54112D-02

At iterate   30    f=  4.85896D+00    |proj g|=  8.07244D-03

At iterate   35    f=  4.85748D+00    |proj g|=  6.08517D-03

At iterate   40    f=  4.85625D+00    |proj g|=  2.37752D-03

At iterate   45    f=  4.85561D+00    |proj g|=  5.38876D-04

At iterate   50    f=  4.85524D+00    |proj g|=  3.16096D-03

At iterate   55    f=  4.85499D+00    |proj g|=  3.37832D-03

At iterate   60    f=  4.85482D+00    |proj g|=  1.00207D-03

At iterate   65    f=  4.85468D+00    |proj g|=  3.27625D-04

At iterate   70    f=  4.85462D+00    |proj g|=  2.61759D-03
  ys=-1.584E-05  -gs= 4.195E-05 BFGS update SKIPPED
 Bad direction in the line search;
   refresh the lbfgs memory and restart the iteration.
At iterate   75    f=  4.85458D+00    |proj g|=  5.83098D-04

At iterate   80    f=  4.85457D+00    |proj g|=  1.46963D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     83    142      2     1     0   1.846D-04   4.855D+00
  F =   4.8545658096327440     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.04835D+00    |proj g|=  4.73350D-02

At iterate    5    f=  4.45079D+00    |proj g|=  4.92279D-04

At iterate   10    f=  4.44929D+00    |proj g|=  1.35171D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     13     24      1     0     0   1.357D-07   4.449D+00
  F =   4.4492803719792393     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.20855D+00    |proj g|=  1.39682D-01

At iterate    5    f=  4.19067D+00    |proj g|=  1.83269D-02

At iterate   10    f=  4.18911D+00    |proj g|=  1.64383D-02

At iterate   15    f=  4.18811D+00    |proj g|=  2.42114D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     16     18      1     0     0   1.672D-05   4.188D+00
  F =   4.1881050647867903     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.05578D+00    |proj g|=  3.78596D-04

At iterate    5    f=  5.05485D+00    |proj g|=  1.64017D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      9     13      1     0     0   2.141D-06   5.055D+00
  F =   5.0547634148666152     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.32428D+00    |proj g|=  5.09594D-02

At iterate    5    f=  4.29776D+00    |proj g|=  7.17842D-04

At iterate   10    f=  4.29625D+00    |proj g|=  7.76417D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     12     20      1     0     0   3.356D-06   4.296D+00
  F =   4.2962468990047959     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.43618D+00    |proj g|=  3.91448D-02

At iterate    5    f=  4.64124D+00    |proj g|=  1.26065D-02

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     16      1     0     0   3.190D-05   4.641D+00
  F =   4.6411049886399400     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.32316D+00    |proj g|=  1.06867D-02

At iterate    5    f=  4.32091D+00    |proj g|=  9.24910D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     11      1     0     0   3.154D-06   4.321D+00
  F =   4.3209072921427758     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.16408D+00    |proj g|=  5.65400D-02

At iterate    5    f=  4.16103D+00    |proj g|=  5.38682D-03

At iterate   10    f=  4.16081D+00    |proj g|=  4.71169D-03

At iterate   15    f=  4.16030D+00    |proj g|=  3.46500D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     16     18      1     0     0   8.525D-06   4.160D+00
  F =   4.1603004704665079     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.71047D+00    |proj g|=  5.14716D-02

At iterate    5    f=  4.09342D+00    |proj g|=  6.18583D-03

At iterate   10    f=  4.09325D+00    |proj g|=  5.73266D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     11     18      1     0     0   1.480D-06   4.093D+00
  F =   4.0932521443062795     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.98208D+00    |proj g|=  3.76896D-02

At iterate    5    f=  4.83523D+00    |proj g|=  3.12129D-04

At iterate   10    f=  4.83502D+00    |proj g|=  3.56719D-03

At iterate   15    f=  4.83432D+00    |proj g|=  7.24780D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     19     29      1     0     0   7.774D-06   4.834D+00
  F =   4.8342611195087226     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  6.00138D+00    |proj g|=  3.71076D-02

At iterate    5    f=  4.77264D+00    |proj g|=  1.37629D-03

At iterate   10    f=  4.77203D+00    |proj g|=  1.87612D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     13     24      1     0     0   4.466D-06   4.772D+00
  F =   4.7719916210089597     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.80004D+00    |proj g|=  7.50990D-02

At iterate    5    f=  4.93369D+00    |proj g|=  2.08081D-03

At iterate   10    f=  4.93329D+00    |proj g|=  3.85616D-03

At iterate   15    f=  4.93155D+00    |proj g|=  1.17300D-02

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     17     25      1     0     0   3.625D-06   4.931D+00
  F =   4.9314012882483729     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.17058D+00    |proj g|=  1.40069D-02

At iterate    5    f=  4.16818D+00    |proj g|=  6.72409D-03

At iterate   10    f=  4.16581D+00    |proj g|=  6.10670D-04

At iterate   15    f=  4.16580D+00    |proj g|=  5.26960D-04

At iterate   20    f=  4.16557D+00    |proj g|=  5.13037D-03

At iterate   25    f=  4.16352D+00    |proj g|=  8.40465D-03

At iterate   30    f=  4.15619D+00    |proj g|=  1.89344D-02

At iterate   35    f=  4.15372D+00    |proj g|=  9.17683D-03

At iterate   40    f=  4.15251D+00    |proj g|=  7.08334D-03

At iterate   45    f=  4.15181D+00    |proj g|=  3.11880D-03

At iterate   50    f=  4.15144D+00    |proj g|=  3.75651D-03

At iterate   55    f=  4.15119D+00    |proj g|=  2.91515D-04

At iterate   60    f=  4.15099D+00    |proj g|=  1.12097D-03

At iterate   65    f=  4.15088D+00    |proj g|=  4.79565D-04

At iterate   70    f=  4.15082D+00    |proj g|=  1.23394D-03

At iterate   75    f=  4.15077D+00    |proj g|=  1.77173D-03

At iterate   80    f=  4.15075D+00    |proj g|=  2.00265D-03

At iterate   85    f=  4.15073D+00    |proj g|=  1.47618D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     85    130      1     0     0   1.476D-03   4.151D+00
  F =   4.1507278583748342     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.07342D+00    |proj g|=  4.60916D-02

At iterate    5    f=  4.07122D+00    |proj g|=  2.49462D-03

At iterate   10    f=  4.07121D+00    |proj g|=  1.13521D-03

At iterate   15    f=  4.07120D+00    |proj g|=  1.06990D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     15     17      1     0     0   1.070D-06   4.071D+00
  F =   4.0712020376363620     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  6.35828D+01    |proj g|=  1.13200D+02

At iterate    5    f=  4.93688D+00    |proj g|=  1.20529D+00

At iterate   10    f=  4.17830D+00    |proj g|=  3.99986D-02

At iterate   15    f=  4.12829D+00    |proj g|=  1.15428D-02

At iterate   20    f=  4.12649D+00    |proj g|=  1.77726D-02

At iterate   25    f=  4.12338D+00    |proj g|=  5.34762D-03

At iterate   30    f=  4.12239D+00    |proj g|=  5.58811D-03

At iterate   35    f=  4.12203D+00    |proj g|=  3.43583D-03

At iterate   40    f=  4.12190D+00    |proj g|=  1.45559D-03

At iterate   45    f=  4.12187D+00    |proj g|=  3.87937D-05

At iterate   50    f=  4.12186D+00    |proj g|=  1.11864D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     52     67      1     0     0   7.215D-06   4.122D+00
  F =   4.1218637093342849     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.48432D+00    |proj g|=  2.33971D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
At iterate    5    f=  4.48328D+00    |proj g|=  6.89507D-03

At iterate   10    f=  4.48291D+00    |proj g|=  6.29915D-04

At iterate   15    f=  4.48260D+00    |proj g|=  9.12864D-03

At iterate   20    f=  4.48180D+00    |proj g|=  2.60790D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     22     25      1     0     0   3.624D-06   4.482D+00
  F =   4.4817978990725509     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.81744D+00    |proj g|=  9.30036D-02

At iterate    5    f=  4.83321D+00    |proj g|=  1.21220D-02

At iterate   10    f=  4.83268D+00    |proj g|=  4.51306D-03

At iterate   15    f=  4.83147D+00    |proj g|=  6.99849D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     19     27      1     0     0   1.891D-05   4.831D+00
  F =   4.8311997703033036     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.21872D+00    |proj g|=  1.03325D-01

At iterate    5    f=  4.20421D+00    |proj g|=  1.99489D-02

At iterate   10    f=  4.20179D+00    |proj g|=  8.79521D-03

At iterate   15    f=  4.20036D+00    |proj g|=  6.45740D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     17     19      1     0     0   9.986D-06   4.200D+00
  F =   4.2003553192212344     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.83776D+00    |proj g|=  3.57960D-02
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
At iterate    5    f=  4.83629D+00    |proj g|=  6.28700D-03

At iterate   10    f=  4.83586D+00    |proj g|=  2.16679D-04

At iterate   15    f=  4.83584D+00    |proj g|=  2.07161D-03

At iterate   20    f=  4.83582D+00    |proj g|=  2.69287D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     20     23      1     0     0   2.693D-06   4.836D+00
  F =   4.8358245384094580     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:997: UserWarning: Non-stationary starting seasonal autoregressive Using zeros as starting parameters.
  warn('Non-stationary starting seasonal autoregressive'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.14789D+00    |proj g|=  3.28728D-02

At iterate    5    f=  4.14576D+00    |proj g|=  1.99976D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     12      1     0     0   2.321D-06   4.146D+00
  F =   4.1457248064741483     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.03019D+00    |proj g|=  1.80975D-01

At iterate    5    f=  4.14417D+00    |proj g|=  1.14671D-02

At iterate   10    f=  4.14390D+00    |proj g|=  9.04800D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     11     20      1     0     0   3.704D-06   4.144D+00
  F =   4.1438974179963894     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.38072D+00    |proj g|=  7.52851D-02

At iterate    5    f=  4.48119D+00    |proj g|=  6.94961D-03

At iterate   10    f=  4.48089D+00    |proj g|=  5.52692D-04

At iterate   15    f=  4.48068D+00    |proj g|=  9.05759D-03

At iterate   20    f=  4.47979D+00    |proj g|=  5.60092D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     22     29      1     0     0   1.273D-06   4.480D+00
  F =   4.4797882115224148     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.67737D+00    |proj g|=  5.66037D-02

At iterate    5    f=  4.60603D+00    |proj g|=  1.76518D-04

At iterate   10    f=  4.60573D+00    |proj g|=  4.14175D-06

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     10     22      1     0     0   4.142D-06   4.606D+00
  F =   4.6057309243682329     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/3722978025.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["units_sold_counterfactual"] = df["units_sold_counterfactual"].astype(float)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  6.37986D+01    |proj g|=  1.14426D+02

At iterate    5    f=  4.93516D+00    |proj g|=  1.16147D+00

At iterate   10    f=  4.24118D+00    |proj g|=  4.06683D-02

At iterate   15    f=  4.19767D+00    |proj g|=  7.71177D-03

At iterate   20    f=  4.19674D+00    |proj g|=  8.66532D-04

At iterate   25    f=  4.19058D+00    |proj g|=  1.92298D-02

At iterate   30    f=  4.18871D+00    |proj g|=  1.06526D-02

At iterate   35    f=  4.18759D+00    |proj g|=  2.76919D-03

At iterate   40    f=  4.18703D+00    |proj g|=  5.37595D-04

At iterate   45    f=  4.18678D+00    |proj g|=  2.64521D-03

At iterate   50    f=  4.18659D+00    |proj g|=  7.65439D-04

At iterate   55    f=  4.18650D+00    |proj g|=  1.57742D-03

At iterate   60    f=  4.18644D+00    |proj g|=  1.08420D-03

At iterate   65    f=  4.18640D+00    |proj g|=  1.32502D-03

At iterate   70    f=  4.18636D+00    |proj g|=  4.03906D-04

At iterate   75    f=  4.18635D+00    |proj g|=  3.87519D-04

At iterate   80    f=  4.18634D+00    |proj g|=  2.14401D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     84    115      1     0     0   1.379D-04   4.186D+00
  F =   4.1863253113752279     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
In [61]:
# Results of the Model Testing for each product

print("Evaluation Metrics - SARIMA")
print("--------------------------------")
print("--------------------------------")

for item in product_list:
    print(f"Evaluation Metrics - Item {item}")
    var_name = "evaluation_metrics_" + str(item) + "_SARIMA"
    print(globals()[var_name])
    print("--------------------------------")
Evaluation Metrics - SARIMA
--------------------------------
--------------------------------
Evaluation Metrics - Item 394846541
{'Mean Absolute Error': 44.80383023472661, 'Mean Squared Error': 4238.698724190287, 'Root Mean Squared Error': 65.1052895254317, 'MAPE': 8.499142552019952}
--------------------------------
Evaluation Metrics - Item 394848615
{'Mean Absolute Error': 15.261430687690485, 'Mean Squared Error': 305.0802965228306, 'Root Mean Squared Error': 17.466547928048936, 'MAPE': 5.289019768312291}
--------------------------------
Evaluation Metrics - Item 394851407
{'Mean Absolute Error': 5.6232588769644565, 'Mean Squared Error': 31.62104039735956, 'Root Mean Squared Error': 5.6232588769644565, 'MAPE': 2.213881447623802}
--------------------------------
Evaluation Metrics - Item 394857109
{'Mean Absolute Error': 18.85123127526903, 'Mean Squared Error': 600.4589391909043, 'Root Mean Squared Error': 24.50426369411871, 'MAPE': 6.390355104934672}
--------------------------------
Evaluation Metrics - Item 394858170
{'Mean Absolute Error': 9.035293993895134, 'Mean Squared Error': 102.9286792617397, 'Root Mean Squared Error': 10.145377236048924, 'MAPE': 6.567768521051544}
--------------------------------
Evaluation Metrics - Item 394860161
{'Mean Absolute Error': 22.61217249431983, 'Mean Squared Error': 733.5883464141427, 'Root Mean Squared Error': 27.08483609723608, 'MAPE': 7.423155721957897}
--------------------------------
Evaluation Metrics - Item 394865583
{'Mean Absolute Error': 21.52490262136007, 'Mean Squared Error': 463.3214328590336, 'Root Mean Squared Error': 21.52490262136007, 'MAPE': 4.262356954724766}
--------------------------------
Evaluation Metrics - Item 394866466
{'Mean Absolute Error': 6.333909185344634, 'Mean Squared Error': 55.49947137158789, 'Root Mean Squared Error': 7.449796733575211, 'MAPE': 4.610070507947828}
--------------------------------
Evaluation Metrics - Item 394873848
{'Mean Absolute Error': 11.867978550974792, 'Mean Squared Error': 176.18527759910432, 'Root Mean Squared Error': 13.273480236889808, 'MAPE': 15.053072528847638}
--------------------------------
Evaluation Metrics - Item 394882669
{'Mean Absolute Error': 17.994538771228008, 'Mean Squared Error': 323.80342558922797, 'Root Mean Squared Error': 17.994538771228008, 'MAPE': 3.5492186925499025}
--------------------------------
Evaluation Metrics - Item 394885779
{'Mean Absolute Error': 44.621259823136995, 'Mean Squared Error': 1991.0568282038998, 'Root Mean Squared Error': 44.621259823136995, 'MAPE': 10.23423390438922}
--------------------------------
Evaluation Metrics - Item 394885885
{'Mean Absolute Error': 11.525835234114297, 'Mean Squared Error': 230.18957408580798, 'Root Mean Squared Error': 15.171999673273394, 'MAPE': 9.63905413090407}
--------------------------------
Evaluation Metrics - Item 394890521
{'Mean Absolute Error': 26.59266412924984, 'Mean Squared Error': 707.1697854910911, 'Root Mean Squared Error': 26.59266412924984, 'MAPE': 6.438901726210615}
--------------------------------
Evaluation Metrics - Item 394893706
{'Mean Absolute Error': 16.654050860453843, 'Mean Squared Error': 366.18304312027084, 'Root Mean Squared Error': 19.135909780312794, 'MAPE': 21.392019128616038}
--------------------------------
Evaluation Metrics - Item 394904090
{'Mean Absolute Error': 23.953572864970738, 'Mean Squared Error': 573.7736529974625, 'Root Mean Squared Error': 23.953572864970738, 'MAPE': 7.984524288323579}
--------------------------------
Evaluation Metrics - Item 394907934
{'Mean Absolute Error': 19.727953791109474, 'Mean Squared Error': 419.305891799839, 'Root Mean Squared Error': 20.47696002339798, 'MAPE': 27.667809883190415}
--------------------------------
Evaluation Metrics - Item 394909995
{'Mean Absolute Error': 17.226131212823866, 'Mean Squared Error': 549.9923549173938, 'Root Mean Squared Error': 23.45191580484191, 'MAPE': 6.569951742275913}
--------------------------------
Evaluation Metrics - Item 394914459
{'Mean Absolute Error': 1.9550787520007589, 'Mean Squared Error': 3.8223329265248447, 'Root Mean Squared Error': 1.9550787520007589, 'MAPE': 1.3865806751778433}
--------------------------------
Evaluation Metrics - Item 394915909
{'Mean Absolute Error': 23.566008331816633, 'Mean Squared Error': 803.1720410312558, 'Root Mean Squared Error': 28.340290066110047, 'MAPE': 5.71224141737249}
--------------------------------
Evaluation Metrics - Item 394917065
{'Mean Absolute Error': 46.56168790223039, 'Mean Squared Error': 2779.4401954001182, 'Root Mean Squared Error': 52.72039638887513, 'MAPE': 9.30848848706606}
--------------------------------
Evaluation Metrics - Item 394917897
{'Mean Absolute Error': 8.60844513240655, 'Mean Squared Error': 99.31471986419207, 'Root Mean Squared Error': 9.965677090102412, 'MAPE': 13.149273349411075}
--------------------------------
Evaluation Metrics - Item 394924633
{'Mean Absolute Error': 31.69430193598447, 'Mean Squared Error': 1004.528775209349, 'Root Mean Squared Error': 31.69430193598447, 'MAPE': 6.083359296734063}
--------------------------------
Evaluation Metrics - Item 394930015
{'Mean Absolute Error': 10.445791439312165, 'Mean Squared Error': 204.5495300578128, 'Root Mean Squared Error': 14.30208131908824, 'MAPE': 15.8258781124236}
--------------------------------
Evaluation Metrics - Item 394930651
{'Mean Absolute Error': 18.29879612037675, 'Mean Squared Error': 334.8459394551152, 'Root Mean Squared Error': 18.29879612037675, 'MAPE': 11.959997464298528}
--------------------------------
Evaluation Metrics - Item 394931359
{'Mean Absolute Error': 29.42200122467625, 'Mean Squared Error': 1172.4370983137808, 'Root Mean Squared Error': 34.24086883117572, 'MAPE': 10.487464677375291}
--------------------------------
Evaluation Metrics - Item 394940184
{'Mean Absolute Error': 18.239861012000937, 'Mean Squared Error': 647.9159783498795, 'Root Mean Squared Error': 25.454193728143885, 'MAPE': 3.644890082172266}
--------------------------------
Evaluation Metrics - Item 394941031
{'Mean Absolute Error': 23.463384417792334, 'Mean Squared Error': 859.8610897306089, 'Root Mean Squared Error': 29.323388101149035, 'MAPE': 7.705949305288443}
--------------------------------
Evaluation Metrics - Item 394941377
{'Mean Absolute Error': 34.505220889602505, 'Mean Squared Error': 1720.7337997454026, 'Root Mean Squared Error': 41.48172850479356, 'MAPE': 11.38751916362757}
--------------------------------
Evaluation Metrics - Item 394942170
{'Mean Absolute Error': 18.60340994244045, 'Mean Squared Error': 612.4213393901639, 'Root Mean Squared Error': 24.747148106199305, 'MAPE': 4.217909187582196}
--------------------------------
Evaluation Metrics - Item 394942631
{'Mean Absolute Error': 29.30507875170631, 'Mean Squared Error': 1025.2206563075824, 'Root Mean Squared Error': 32.019067074285324, 'MAPE': 6.023523224633279}
--------------------------------
Evaluation Metrics - Item 394950597
{'Mean Absolute Error': 26.42390815520617, 'Mean Squared Error': 888.2982528644272, 'Root Mean Squared Error': 29.80433278676822, 'MAPE': 6.4078209623793345}
--------------------------------
Evaluation Metrics - Item 394951176
{'Mean Absolute Error': 22.232857235674317, 'Mean Squared Error': 738.1734990379845, 'Root Mean Squared Error': 27.169348520676465, 'MAPE': 4.427394676276261}
--------------------------------
Evaluation Metrics - Item 395052168
{'Mean Absolute Error': 33.12709105405901, 'Mean Squared Error': 1486.2705654491597, 'Root Mean Squared Error': 38.552179775586744, 'MAPE': 6.475021610016767}
--------------------------------
Evaluation Metrics - Item 395357341
{'Mean Absolute Error': 15.693614357088455, 'Mean Squared Error': 266.017772202965, 'Root Mean Squared Error': 16.31005126303915, 'MAPE': 10.921625431408307}
--------------------------------
Evaluation Metrics - Item 395368886
{'Mean Absolute Error': 37.62870089293366, 'Mean Squared Error': 1916.4214883573761, 'Root Mean Squared Error': 43.77695156537714, 'MAPE': 24.479614804590014}
--------------------------------
Evaluation Metrics - Item 395375136
{'Mean Absolute Error': 50.188326919899005, 'Mean Squared Error': 3301.3712793354666, 'Root Mean Squared Error': 57.457560680344464, 'MAPE': 16.57058541972731}
--------------------------------
Evaluation Metrics - Item 395382145
{'Mean Absolute Error': 13.473150971990558, 'Mean Squared Error': 294.8544622741033, 'Root Mean Squared Error': 17.17132674763669, 'MAPE': 9.23640608891057}
--------------------------------
Evaluation Metrics - Item 395384129
{'Mean Absolute Error': 19.049206525502893, 'Mean Squared Error': 458.0120144140334, 'Root Mean Squared Error': 21.40121525554176, 'MAPE': 11.609907591911297}
--------------------------------
Evaluation Metrics - Item 511584598
{'Mean Absolute Error': 24.739288687424377, 'Mean Squared Error': 842.8685894732001, 'Root Mean Squared Error': 29.032199184236802, 'MAPE': 8.559828996966267}
--------------------------------
Evaluation Metrics - Item 512317690
{'Mean Absolute Error': 17.33022463099749, 'Mean Squared Error': 559.6995469445022, 'Root Mean Squared Error': 23.657970051221685, 'MAPE': 11.089316282875085}
--------------------------------
Evaluation Metrics - Item 512317697
{'Mean Absolute Error': 55.87681169245238, 'Mean Squared Error': 3984.008234048389, 'Root Mean Squared Error': 63.1190005786561, 'MAPE': 19.88916442315835}
--------------------------------
Evaluation Metrics - Item 512317702
{'Mean Absolute Error': 62.74488431730989, 'Mean Squared Error': 4644.468659738142, 'Root Mean Squared Error': 68.15033866194754, 'MAPE': 21.384727977305594}
--------------------------------
Evaluation Metrics - Item 512317726
{'Mean Absolute Error': 33.91503539612516, 'Mean Squared Error': 1415.7300895539067, 'Root Mean Squared Error': 37.62618887894317, 'MAPE': 43.24031121148331}
--------------------------------
Evaluation Metrics - Item 512317737
{'Mean Absolute Error': 42.36270613505235, 'Mean Squared Error': 2531.268738208856, 'Root Mean Squared Error': 50.311715715217424, 'MAPE': 11.832158708032372}
--------------------------------
Evaluation Metrics - Item 512317760
{'Mean Absolute Error': 23.02398538596151, 'Mean Squared Error': 601.0941859413891, 'Root Mean Squared Error': 24.517222231349724, 'MAPE': 35.89203794940403}
--------------------------------
Evaluation Metrics - Item 512317763
{'Mean Absolute Error': 14.674820693718905, 'Mean Squared Error': 275.19744103295994, 'Root Mean Squared Error': 16.58907595476493, 'MAPE': 18.9030444211949}
--------------------------------
Evaluation Metrics - Item 512319115
{'Mean Absolute Error': 22.464126338853163, 'Mean Squared Error': 658.5469888561005, 'Root Mean Squared Error': 25.66217038475313, 'MAPE': 30.300311371090906}
--------------------------------
Evaluation Metrics - Item 512319119
{'Mean Absolute Error': 15.054712791661304, 'Mean Squared Error': 390.3115020109277, 'Root Mean Squared Error': 19.756302842660812, 'MAPE': 11.24223925653468}
--------------------------------
Evaluation Metrics - Item 512319130
{'Mean Absolute Error': 42.20589848087171, 'Mean Squared Error': 2514.00720219079, 'Root Mean Squared Error': 50.13987636792486, 'MAPE': 14.737259049067633}
--------------------------------
Evaluation Metrics - Item 512319152
{'Mean Absolute Error': 31.147872378911927, 'Mean Squared Error': 1364.5035910754589, 'Root Mean Squared Error': 36.939187742497246, 'MAPE': 11.059611615498836}
--------------------------------
Evaluation Metrics - Item 512319154
{'Mean Absolute Error': 43.611932578060696, 'Mean Squared Error': 2606.9609201549924, 'Root Mean Squared Error': 51.05840694885605, 'MAPE': 9.182528084308451}
--------------------------------
Evaluation Metrics - Item 512319978
{'Mean Absolute Error': 16.206953242796786, 'Mean Squared Error': 407.2517198737462, 'Root Mean Squared Error': 20.180478682968502, 'MAPE': 5.56606141006688}
--------------------------------
Evaluation Metrics - Item 512319985
{'Mean Absolute Error': 16.721560086715556, 'Mean Squared Error': 447.16482640238735, 'Root Mean Squared Error': 21.14627216325344, 'MAPE': 5.93152370000601}
--------------------------------
Evaluation Metrics - Item 512320013
{'Mean Absolute Error': 33.29741610049194, 'Mean Squared Error': 1421.1456287141732, 'Root Mean Squared Error': 37.69808521283506, 'MAPE': 21.659237640761745}
--------------------------------
Evaluation Metrics - Item 512320017
{'Mean Absolute Error': 30.49095774841582, 'Mean Squared Error': 1526.0128497375679, 'Root Mean Squared Error': 39.06421443901782, 'MAPE': 7.285631610144988}
--------------------------------
Evaluation Metrics - Item 512464613
{'Mean Absolute Error': 15.357783990334031, 'Mean Squared Error': 346.5891817455354, 'Root Mean Squared Error': 18.61690580481986, 'MAPE': 10.609735451577617}
--------------------------------
Evaluation Metrics - Item 512464615
{'Mean Absolute Error': 27.24686799404184, 'Mean Squared Error': 1219.349081882407, 'Root Mean Squared Error': 34.91917928420436, 'MAPE': 9.355688079474232}
--------------------------------
Evaluation Metrics - Item 512464625
{'Mean Absolute Error': 24.052532452436296, 'Mean Squared Error': 759.5428976475681, 'Root Mean Squared Error': 27.559805834721843, 'MAPE': 8.226765592612992}
--------------------------------
Evaluation Metrics - Item 512464633
{'Mean Absolute Error': 19.22157245725067, 'Mean Squared Error': 605.3973304004296, 'Root Mean Squared Error': 24.604823315773466, 'MAPE': 13.068273050389015}
--------------------------------
Evaluation Metrics - Item 512464642
{'Mean Absolute Error': 24.471392047116506, 'Mean Squared Error': 867.3565290293011, 'Root Mean Squared Error': 29.450917286721328, 'MAPE': 36.488460852148776}
--------------------------------
Evaluation Metrics - Item 512464646
{'Mean Absolute Error': 25.5458145690625, 'Mean Squared Error': 1115.0171853968607, 'Root Mean Squared Error': 33.391873044153435, 'MAPE': 5.078258228017823}
--------------------------------
Evaluation Metrics - Item 512464651
{'Mean Absolute Error': 41.08608152781781, 'Mean Squared Error': 2390.0932996202578, 'Root Mean Squared Error': 48.8885804623151, 'MAPE': 8.039094069694757}
--------------------------------
Evaluation Metrics - Item 512464658
{'Mean Absolute Error': 52.89188901194975, 'Mean Squared Error': 3651.5801412706437, 'Root Mean Squared Error': 60.42830579513746, 'MAPE': 12.214896531654906}
--------------------------------
Evaluation Metrics - Item 514002189
{'Mean Absolute Error': 18.54532257022707, 'Mean Squared Error': 592.4079720655728, 'Root Mean Squared Error': 24.339432451591243, 'MAPE': 6.388467696419525}
--------------------------------
Evaluation Metrics - Item 515375115
{'Mean Absolute Error': 17.686258380345986, 'Mean Squared Error': 486.9179789067931, 'Root Mean Squared Error': 22.06621804720494, 'MAPE': 26.669113109629116}
--------------------------------
Evaluation Metrics - Item 515702203
{'Mean Absolute Error': 16.222303111001782, 'Mean Squared Error': 384.6359894242438, 'Root Mean Squared Error': 19.61213882839513, 'MAPE': 5.685320675515583}
--------------------------------
Evaluation Metrics - Item 515775902
{'Mean Absolute Error': 15.922559281800554, 'Mean Squared Error': 348.5765775937713, 'Root Mean Squared Error': 18.670205611984333, 'MAPE': 3.9724793831780048}
--------------------------------
Evaluation Metrics - Item 515775912
{'Mean Absolute Error': 27.44845738033821, 'Mean Squared Error': 1177.109631358616, 'Root Mean Squared Error': 34.30903133809837, 'MAPE': 6.412400102833217}
--------------------------------
Evaluation Metrics - Item 515775929
{'Mean Absolute Error': 22.22139512117893, 'Mean Squared Error': 675.5601011923562, 'Root Mean Squared Error': 25.991539030853026, 'MAPE': 15.563544222114176}
--------------------------------
Evaluation Metrics - Item 515775953
{'Mean Absolute Error': 73.46423420033632, 'Mean Squared Error': 6223.051612104631, 'Root Mean Squared Error': 78.88632081739287, 'MAPE': 14.679175059070149}
--------------------------------
Evaluation Metrics - Item 515775957
{'Mean Absolute Error': 19.18814278330666, 'Mean Squared Error': 585.2570932707782, 'Root Mean Squared Error': 24.192087410365776, 'MAPE': 12.843779829189897}
--------------------------------
Evaluation Metrics - Item 516001717
{'Mean Absolute Error': 14.834762605897614, 'Mean Squared Error': 399.6570138500593, 'Root Mean Squared Error': 19.991423507345825, 'MAPE': 18.951586583837116}
--------------------------------
Evaluation Metrics - Item 516001998
{'Mean Absolute Error': 16.764635200615217, 'Mean Squared Error': 421.86638511221594, 'Root Mean Squared Error': 20.539386191223336, 'MAPE': 5.841257182206816}
--------------------------------
Evaluation Metrics - Item 516002000
{'Mean Absolute Error': 33.97043028674777, 'Mean Squared Error': 1614.3496241206985, 'Root Mean Squared Error': 40.178969923589364, 'MAPE': 7.65817212440929}
--------------------------------
Evaluation Metrics - Item 516007566
{'Mean Absolute Error': 28.82758543745674, 'Mean Squared Error': 1737.5354129880507, 'Root Mean Squared Error': 41.68375478514442, 'MAPE': 9.20627038199973}
--------------------------------
In [62]:
combined_df_SARIMA = pd.DataFrame()

for index, item in enumerate(product_list):
    subset_promoted = filtered_df_promotion[filtered_df_promotion["item_id"] == item]
    subset_promoted_v1 = subset_promoted.asfreq('W-MON')
    predictions = pd.Series(np.array(models_SARIMA[index].forecast(len(subset_promoted_v1))))
    subset_promoted = subset_promoted.reset_index().rename(columns={'index': 'history_date'})
    subset_promoted["units_sold_counterfactual_SARIMA"] = predictions
    combined_df_SARIMA = pd.concat([combined_df_SARIMA, subset_promoted], axis=0)

ARIMA Model¶

In [63]:
# ARIMA model, Train Test split and Model Evaluation

def ARIMA_train_test(df):
     
    train_size = int(len(df) * 0.8)
    train_data, test_data = df[:train_size], df[train_size:]

    train_data = train_data.asfreq('W-MON')
    test_data = test_data.asfreq('W-MON')

    # Fit the ARIMA model on the training data
    model = ARIMA(train_data["units_sold_counterfactual"], order=(1, 1, 1))
    model_fit = model.fit()
    
    # Get the predicted values for the test data
    predictions = model_fit.predict(start=test_data.index[0], end=test_data.index[-1], dynamic=False)

    # Compute accuracy measures
    mse = np.mean((test_data["units_sold_counterfactual"] - predictions)**2)  # Mean Squared Error
    rmse = np.sqrt(mse)  # Root Mean Squared Error
    mae = np.mean(np.abs(test_data["units_sold_counterfactual"] - predictions))  # Mean Absolute Error
    mape = np.mean(np.abs((test_data["units_sold_counterfactual"] - predictions) / test_data["units_sold_counterfactual"])) * 100  # Mean Absolute Percentage Error
    
    evaluation_metrics = {
                        'Mean Absolute Error': mae,
                        'Mean Squared Error': mse,
                        'Root Mean Squared Error': rmse,
                        'MAPE': mape,
                         }

    
    return predictions, evaluation_metrics, model_fit
In [64]:
# Model Fitting ARIMA

models_ARIMA = []

for item in product_list:
    var_name_prediction_v1 = "prediction_" + str(item) + "_ARIMA"
    var_name_em_v1 = "evaluation_metrics_" + str(item) + "_ARIMA"
    globals()[var_name_prediction_v1], globals()[var_name_em_v1], ARIMA_model = ARIMA_train_test(filtered_df_non_promotion[filtered_df_non_promotion['item_id'] == item])
    models_ARIMA.append(ARIMA_model)
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/base/model.py:604: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  warnings.warn("Maximum Likelihood optimization failed to "
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/base/model.py:604: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  warnings.warn("Maximum Likelihood optimization failed to "
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
  warn('Too few observations to estimate starting parameters%s.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
/Users/sababasaadusmani/opt/anaconda3/lib/python3.9/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
In [65]:
# Results of the Model Testing for each product

print("Evaluation Metrics - ARIMA")
print("--------------------------------")
print("--------------------------------")

for item in product_list:
    print(f"Evaluation Metrics - Item {item}")
    var_name = "evaluation_metrics_" + str(item) + "_ARIMA"
    print(globals()[var_name])
    print("--------------------------------")
Evaluation Metrics - ARIMA
--------------------------------
--------------------------------
Evaluation Metrics - Item 394846541
{'Mean Absolute Error': 28.312860179566, 'Mean Squared Error': 1255.2096321026318, 'Root Mean Squared Error': 35.42893777835615, 'MAPE': 5.5750731254070995}
--------------------------------
Evaluation Metrics - Item 394848615
{'Mean Absolute Error': 10.692853976435586, 'Mean Squared Error': 215.36414170338713, 'Root Mean Squared Error': 14.6752901744186, 'MAPE': 3.8754720772491105}
--------------------------------
Evaluation Metrics - Item 394851407
{'Mean Absolute Error': 10.848138718228995, 'Mean Squared Error': 117.68211364993903, 'Root Mean Squared Error': 10.848138718228995, 'MAPE': 4.270920755208266}
--------------------------------
Evaluation Metrics - Item 394857109
{'Mean Absolute Error': 17.657405695454635, 'Mean Squared Error': 561.2419986121574, 'Root Mean Squared Error': 23.690546608555856, 'MAPE': 5.99557484054373}
--------------------------------
Evaluation Metrics - Item 394858170
{'Mean Absolute Error': 9.534871009737955, 'Mean Squared Error': 116.50991753981025, 'Root Mean Squared Error': 10.793975983844426, 'MAPE': 6.995200687597414}
--------------------------------
Evaluation Metrics - Item 394860161
{'Mean Absolute Error': 21.215206801797514, 'Mean Squared Error': 736.8127116191037, 'Root Mean Squared Error': 27.144294273734648, 'MAPE': 6.941332419776296}
--------------------------------
Evaluation Metrics - Item 394865583
{'Mean Absolute Error': 28.63665832847113, 'Mean Squared Error': 820.0582002215948, 'Root Mean Squared Error': 28.63665832847113, 'MAPE': 5.670625411578442}
--------------------------------
Evaluation Metrics - Item 394866466
{'Mean Absolute Error': 7.968251681574078, 'Mean Squared Error': 102.6869905897509, 'Root Mean Squared Error': 10.133458964724282, 'MAPE': 5.98790748138026}
--------------------------------
Evaluation Metrics - Item 394873848
{'Mean Absolute Error': 5.869493912543682, 'Mean Squared Error': 58.95086898031288, 'Root Mean Squared Error': 7.677946924817395, 'MAPE': 7.991764654526455}
--------------------------------
Evaluation Metrics - Item 394882669
{'Mean Absolute Error': 6.229101910920065, 'Mean Squared Error': 38.801710616628, 'Root Mean Squared Error': 6.229101910920065, 'MAPE': 1.2286197062958708}
--------------------------------
Evaluation Metrics - Item 394885779
{'Mean Absolute Error': 17.675836287244636, 'Mean Squared Error': 312.4351884534742, 'Root Mean Squared Error': 17.675836287244636, 'MAPE': 4.054090891569871}
--------------------------------
Evaluation Metrics - Item 394885885
{'Mean Absolute Error': 11.714008855482014, 'Mean Squared Error': 230.38597165528427, 'Root Mean Squared Error': 15.178470662595895, 'MAPE': 9.767124977050806}
--------------------------------
Evaluation Metrics - Item 394890521
{'Mean Absolute Error': 22.40060073371609, 'Mean Squared Error': 501.7869132313618, 'Root Mean Squared Error': 22.40060073371609, 'MAPE': 5.423874269664912}
--------------------------------
Evaluation Metrics - Item 394893706
{'Mean Absolute Error': 8.54597800526918, 'Mean Squared Error': 93.5759821001462, 'Root Mean Squared Error': 9.67346794588922, 'MAPE': 11.778859081224335}
--------------------------------
Evaluation Metrics - Item 394904090
{'Mean Absolute Error': 20.32389812866444, 'Mean Squared Error': 413.0608351443299, 'Root Mean Squared Error': 20.32389812866444, 'MAPE': 6.774632709554814}
--------------------------------
Evaluation Metrics - Item 394907934
{'Mean Absolute Error': 5.792230161182623, 'Mean Squared Error': 42.220687120701534, 'Root Mean Squared Error': 6.497744771895979, 'MAPE': 8.42608582909782}
--------------------------------
Evaluation Metrics - Item 394909995
{'Mean Absolute Error': 23.325024749767273, 'Mean Squared Error': 835.6014514154662, 'Root Mean Squared Error': 28.906771722478215, 'MAPE': 8.85514218885883}
--------------------------------
Evaluation Metrics - Item 394914459
{'Mean Absolute Error': 1.4877409419966625, 'Mean Squared Error': 2.2133731104931167, 'Root Mean Squared Error': 1.4877409419966625, 'MAPE': 1.055135419855789}
--------------------------------
Evaluation Metrics - Item 394915909
{'Mean Absolute Error': 23.67321856568355, 'Mean Squared Error': 841.8231645571733, 'Root Mean Squared Error': 29.014189021187086, 'MAPE': 5.960790306668106}
--------------------------------
Evaluation Metrics - Item 394917065
{'Mean Absolute Error': 24.393265140443134, 'Mean Squared Error': 959.8578006620718, 'Root Mean Squared Error': 30.981571952728153, 'MAPE': 5.1479773130344135}
--------------------------------
Evaluation Metrics - Item 394917897
{'Mean Absolute Error': 8.976515351822142, 'Mean Squared Error': 108.57370588090397, 'Root Mean Squared Error': 10.419870722849875, 'MAPE': 14.304826994934242}
--------------------------------
Evaluation Metrics - Item 394924633
{'Mean Absolute Error': 0.46549156332866914, 'Mean Squared Error': 0.2166823955301684, 'Root Mean Squared Error': 0.46549156332866914, 'MAPE': 0.08934578950646241}
--------------------------------
Evaluation Metrics - Item 394930015
{'Mean Absolute Error': 13.237059932446355, 'Mean Squared Error': 260.1261490604483, 'Root Mean Squared Error': 16.128426738539886, 'MAPE': 19.50586583106151}
--------------------------------
Evaluation Metrics - Item 394930651
{'Mean Absolute Error': 22.703443144505883, 'Mean Squared Error': 515.4463306158111, 'Root Mean Squared Error': 22.703443144505883, 'MAPE': 14.838851728435218}
--------------------------------
Evaluation Metrics - Item 394931359
{'Mean Absolute Error': 12.688796536067223, 'Mean Squared Error': 290.66987778469405, 'Root Mean Squared Error': 17.0490433099542, 'MAPE': 4.58925262664145}
--------------------------------
Evaluation Metrics - Item 394940184
{'Mean Absolute Error': 16.73115382715565, 'Mean Squared Error': 433.6139237714948, 'Root Mean Squared Error': 20.823398468345527, 'MAPE': 3.4381647382788496}
--------------------------------
Evaluation Metrics - Item 394941031
{'Mean Absolute Error': 20.721137978419655, 'Mean Squared Error': 650.8087519864193, 'Root Mean Squared Error': 25.51095356874022, 'MAPE': 6.871857060551677}
--------------------------------
Evaluation Metrics - Item 394941377
{'Mean Absolute Error': 22.036742017714733, 'Mean Squared Error': 793.8767350297471, 'Root Mean Squared Error': 28.175818267261505, 'MAPE': 7.562736551106268}
--------------------------------
Evaluation Metrics - Item 394942170
{'Mean Absolute Error': 16.8746170441334, 'Mean Squared Error': 360.8718285928033, 'Root Mean Squared Error': 18.99662676879249, 'MAPE': 3.9338510305364687}
--------------------------------
Evaluation Metrics - Item 394942631
{'Mean Absolute Error': 21.263594744651023, 'Mean Squared Error': 1040.27077210778, 'Root Mean Squared Error': 32.25322886329026, 'MAPE': 4.644828702323141}
--------------------------------
Evaluation Metrics - Item 394950597
{'Mean Absolute Error': 30.80690270466294, 'Mean Squared Error': 1163.1962808917262, 'Root Mean Squared Error': 34.1056634723872, 'MAPE': 7.717539727341116}
--------------------------------
Evaluation Metrics - Item 394951176
{'Mean Absolute Error': 13.192577523863463, 'Mean Squared Error': 243.1644865889646, 'Root Mean Squared Error': 15.59373228540764, 'MAPE': 2.65877376996835}
--------------------------------
Evaluation Metrics - Item 395052168
{'Mean Absolute Error': 15.792567101801843, 'Mean Squared Error': 453.6258581511438, 'Root Mean Squared Error': 21.29849426957558, 'MAPE': 3.101124536712086}
--------------------------------
Evaluation Metrics - Item 395357341
{'Mean Absolute Error': 5.877133872575957, 'Mean Squared Error': 96.90115176088689, 'Root Mean Squared Error': 9.843838263649342, 'MAPE': 4.532237900294273}
--------------------------------
Evaluation Metrics - Item 395368886
{'Mean Absolute Error': 15.960520538350455, 'Mean Squared Error': 366.56625284331335, 'Root Mean Squared Error': 19.145920005142436, 'MAPE': 10.905289674633458}
--------------------------------
Evaluation Metrics - Item 395375136
{'Mean Absolute Error': 22.713247961043294, 'Mean Squared Error': 913.9269046968084, 'Root Mean Squared Error': 30.23122400262365, 'MAPE': 7.808436831149113}
--------------------------------
Evaluation Metrics - Item 395382145
{'Mean Absolute Error': 11.195872587106209, 'Mean Squared Error': 228.34397409863334, 'Root Mean Squared Error': 15.111054698419741, 'MAPE': 8.292833771963926}
--------------------------------
Evaluation Metrics - Item 395384129
{'Mean Absolute Error': 13.336386483171552, 'Mean Squared Error': 288.8623949265908, 'Root Mean Squared Error': 16.99595231008227, 'MAPE': 8.046117215434855}
--------------------------------
Evaluation Metrics - Item 511584598
{'Mean Absolute Error': 14.913920183138046, 'Mean Squared Error': 336.93231497273223, 'Root Mean Squared Error': 18.35571613892338, 'MAPE': 5.130032996699329}
--------------------------------
Evaluation Metrics - Item 512317690
{'Mean Absolute Error': 16.897412458988956, 'Mean Squared Error': 396.18168850898707, 'Root Mean Squared Error': 19.90431331417859, 'MAPE': 11.912574355791849}
--------------------------------
Evaluation Metrics - Item 512317697
{'Mean Absolute Error': 19.3379126322417, 'Mean Squared Error': 580.6956131193616, 'Root Mean Squared Error': 24.097626711345697, 'MAPE': 7.21147054693637}
--------------------------------
Evaluation Metrics - Item 512317702
{'Mean Absolute Error': 21.905722797583117, 'Mean Squared Error': 783.337931530511, 'Root Mean Squared Error': 27.988174851721055, 'MAPE': 7.815194241613686}
--------------------------------
Evaluation Metrics - Item 512317726
{'Mean Absolute Error': 12.505416759839612, 'Mean Squared Error': 268.57869865229094, 'Root Mean Squared Error': 16.38837083581803, 'MAPE': 14.974818527391728}
--------------------------------
Evaluation Metrics - Item 512317737
{'Mean Absolute Error': 51.29673069598962, 'Mean Squared Error': 3346.8565025227545, 'Root Mean Squared Error': 57.852022458361425, 'MAPE': 14.20445365841643}
--------------------------------
Evaluation Metrics - Item 512317760
{'Mean Absolute Error': 32.565327029612156, 'Mean Squared Error': 1155.3263292935064, 'Root Mean Squared Error': 33.99009163408517, 'MAPE': 50.35075082380498}
--------------------------------
Evaluation Metrics - Item 512317763
{'Mean Absolute Error': 7.624361509484493, 'Mean Squared Error': 73.24613766737734, 'Root Mean Squared Error': 8.5583957414563, 'MAPE': 10.541157610313208}
--------------------------------
Evaluation Metrics - Item 512319115
{'Mean Absolute Error': 8.921389735557792, 'Mean Squared Error': 132.57286010187337, 'Root Mean Squared Error': 11.51402883885017, 'MAPE': 13.409345158532766}
--------------------------------
Evaluation Metrics - Item 512319119
{'Mean Absolute Error': 14.360913401502295, 'Mean Squared Error': 337.5489739598023, 'Root Mean Squared Error': 18.372505924881406, 'MAPE': 10.169008130231347}
--------------------------------
Evaluation Metrics - Item 512319130
{'Mean Absolute Error': 18.126922040148923, 'Mean Squared Error': 574.29534400132, 'Root Mean Squared Error': 23.964460018980606, 'MAPE': 6.6745676629694675}
--------------------------------
Evaluation Metrics - Item 512319152
{'Mean Absolute Error': 13.666388099172055, 'Mean Squared Error': 381.5032512517681, 'Root Mean Squared Error': 19.5321082131901, 'MAPE': 5.0721780096576765}
--------------------------------
Evaluation Metrics - Item 512319154
{'Mean Absolute Error': 23.40794234628681, 'Mean Squared Error': 954.7987054676463, 'Root Mean Squared Error': 30.899817240036327, 'MAPE': 4.795106782327685}
--------------------------------
Evaluation Metrics - Item 512319978
{'Mean Absolute Error': 10.581291530886212, 'Mean Squared Error': 193.0397179166874, 'Root Mean Squared Error': 13.893873395014344, 'MAPE': 3.787370020437501}
--------------------------------
Evaluation Metrics - Item 512319985
{'Mean Absolute Error': 15.52603527390196, 'Mean Squared Error': 379.34578128640106, 'Root Mean Squared Error': 19.476801105068592, 'MAPE': 5.675080992157515}
--------------------------------
Evaluation Metrics - Item 512320013
{'Mean Absolute Error': 17.974027920015534, 'Mean Squared Error': 515.1085772436824, 'Root Mean Squared Error': 22.69600355224863, 'MAPE': 12.385053638097105}
--------------------------------
Evaluation Metrics - Item 512320017
{'Mean Absolute Error': 24.694188407679647, 'Mean Squared Error': 985.998694807578, 'Root Mean Squared Error': 31.400616153311038, 'MAPE': 6.132850829267465}
--------------------------------
Evaluation Metrics - Item 512464613
{'Mean Absolute Error': 11.713697849841857, 'Mean Squared Error': 248.14265923590042, 'Root Mean Squared Error': 15.752544532103391, 'MAPE': 8.2134068862771}
--------------------------------
Evaluation Metrics - Item 512464615
{'Mean Absolute Error': 14.564395963428256, 'Mean Squared Error': 420.44480671982996, 'Root Mean Squared Error': 20.5047508329126, 'MAPE': 5.171636579661651}
--------------------------------
Evaluation Metrics - Item 512464625
{'Mean Absolute Error': 12.265783776212192, 'Mean Squared Error': 234.44419863710667, 'Root Mean Squared Error': 15.311570743627405, 'MAPE': 4.37559665580052}
--------------------------------
Evaluation Metrics - Item 512464633
{'Mean Absolute Error': 14.266348125786697, 'Mean Squared Error': 315.1560580137202, 'Root Mean Squared Error': 17.752635241386564, 'MAPE': 10.672356572223054}
--------------------------------
Evaluation Metrics - Item 512464642
{'Mean Absolute Error': 8.621075727654292, 'Mean Squared Error': 129.25128141157677, 'Root Mean Squared Error': 11.3688733571791, 'MAPE': 13.219637213451444}
--------------------------------
Evaluation Metrics - Item 512464646
{'Mean Absolute Error': 24.73315045116282, 'Mean Squared Error': 995.7470059768478, 'Root Mean Squared Error': 31.555459210362443, 'MAPE': 5.030875897701907}
--------------------------------
Evaluation Metrics - Item 512464651
{'Mean Absolute Error': 18.10979633369148, 'Mean Squared Error': 694.2473268428225, 'Root Mean Squared Error': 26.348573525768384, 'MAPE': 3.624606901794997}
--------------------------------
Evaluation Metrics - Item 512464658
{'Mean Absolute Error': 28.06251022022059, 'Mean Squared Error': 1143.6127407538231, 'Root Mean Squared Error': 33.817343786196794, 'MAPE': 6.818000846521353}
--------------------------------
Evaluation Metrics - Item 514002189
{'Mean Absolute Error': 12.78785992923423, 'Mean Squared Error': 265.83203903783084, 'Root Mean Squared Error': 16.30435644353468, 'MAPE': 4.475845974184713}
--------------------------------
Evaluation Metrics - Item 515375115
{'Mean Absolute Error': 9.921314176074167, 'Mean Squared Error': 165.84764709775718, 'Root Mean Squared Error': 12.878184930251514, 'MAPE': 18.066194209233792}
--------------------------------
Evaluation Metrics - Item 515702203
{'Mean Absolute Error': 25.170426585537626, 'Mean Squared Error': 875.55289653453, 'Root Mean Squared Error': 29.589743096798422, 'MAPE': 8.580331744073973}
--------------------------------
Evaluation Metrics - Item 515775902
{'Mean Absolute Error': 23.057513046661924, 'Mean Squared Error': 833.6844977434221, 'Root Mean Squared Error': 28.873595164846066, 'MAPE': 5.881696688662941}
--------------------------------
Evaluation Metrics - Item 515775912
{'Mean Absolute Error': 23.19294485806364, 'Mean Squared Error': 843.831027868785, 'Root Mean Squared Error': 29.048769816788887, 'MAPE': 5.8541032458647555}
--------------------------------
Evaluation Metrics - Item 515775929
{'Mean Absolute Error': 14.437474911151483, 'Mean Squared Error': 429.936375410342, 'Root Mean Squared Error': 20.734907171490836, 'MAPE': 10.377910455607926}
--------------------------------
Evaluation Metrics - Item 515775953
{'Mean Absolute Error': 25.042802097361232, 'Mean Squared Error': 1226.3643614857474, 'Root Mean Squared Error': 35.01948545432596, 'MAPE': 5.1158230813420555}
--------------------------------
Evaluation Metrics - Item 515775957
{'Mean Absolute Error': 14.087465998753492, 'Mean Squared Error': 366.5679834065386, 'Root Mean Squared Error': 19.145965199136306, 'MAPE': 10.43766395294944}
--------------------------------
Evaluation Metrics - Item 516001717
{'Mean Absolute Error': 14.017128968919538, 'Mean Squared Error': 322.5057719798903, 'Root Mean Squared Error': 17.958445700558006, 'MAPE': 20.812294645613168}
--------------------------------
Evaluation Metrics - Item 516001998
{'Mean Absolute Error': 17.20979715565102, 'Mean Squared Error': 468.35119760418, 'Root Mean Squared Error': 21.641423188047963, 'MAPE': 6.147985636392942}
--------------------------------
Evaluation Metrics - Item 516002000
{'Mean Absolute Error': 29.81897866127544, 'Mean Squared Error': 1261.9118405665602, 'Root Mean Squared Error': 35.52339849404277, 'MAPE': 6.756050596156158}
--------------------------------
Evaluation Metrics - Item 516007566
{'Mean Absolute Error': 20.19102242986535, 'Mean Squared Error': 693.0478817932609, 'Root Mean Squared Error': 26.325802585928145, 'MAPE': 6.603435185658832}
--------------------------------
In [66]:
combined_df_ARIMA = pd.DataFrame()

for index, item in enumerate(product_list):
    subset_promoted = filtered_df_promotion[filtered_df_promotion["item_id"] == item]
    try:
        predictions = models_ARIMA[index].predict(start=subset_promoted.index[0], end=subset_promoted.index[-1], dynamic=False)
    except KeyError:
        predictions = 0
        continue
    subset_promoted = subset_promoted.reset_index().rename(columns={'index': 'history_date'})
    subset_promoted["units_sold_counterfactual_ARIMA"] = predictions[1]
    combined_df_ARIMA = pd.concat([combined_df_ARIMA, subset_promoted], axis=0)

XGBoost Model¶

In [67]:
# XGBoost Model, Train Test Split and Model Evaluation

def XGBoost_train_test(df):
    
    df = df.reset_index().rename(columns={'index': 'history_date'})
    df['history_date'] = pd.to_datetime(df['history_date'])
    df['history_date'] = df['history_date'].astype(int)
    
    df['week'] = df['week'].astype(int)
    df['year'] = df['year'].astype(int)
    df['month'] = df['month'].astype(int)
    df['promo_type'] = df['promo_type'].astype(int)
    df['category_id'] = df['category_id'].astype(int)
    df['days_per_week_of_promotion'] = df['days_per_week_of_promotion'].astype(int)
    df['is_in_promotion'] = df['is_in_promotion'].astype(int)
    
    y = df["units_sold_counterfactual"]
    X = df.drop('units_sold_counterfactual', axis=1)
    
    # Splitting the data into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

    # Creating the XGBoost DMatrix from the train set
    dtrain = xgb.DMatrix(X_train, label=y_train)

    # Creating the XGBoost DMatrix from the test set
    dtest = xgb.DMatrix(X_test)

    # Defining the XGBoost parameters
    params = {
        'objective': 'reg:squarederror',  # Assuming it's a regression problem
        'eval_metric': 'rmse',
        'max_depth': 3,
        'learning_rate': 0.1
    }

    # Training the XGBoost model
    model = xgb.train(params, dtrain)

    # Making predictions on the test set
    predictions = model.predict(dtest)   

    # Compute accuracy measures
    mse = np.mean((y_test - predictions) ** 2)  # Mean Squared Error
    rmse = np.sqrt(mse)  # Root Mean Squared Error
    mae = np.mean(np.abs(y_test - predictions))  # Mean Absolute Error
    mape = np.mean(np.abs((y_test - predictions) / y_test)) * 100  # Mean Absolute Percentage Error
    
    evaluation_metrics = {
                        'Mean Absolute Error': mae,
                        'Mean Squared Error': mse,
                        'Root Mean Squared Error': rmse,
                        'MAPE': mape,
                         }

    
    return predictions, evaluation_metrics, model
In [68]:
# Model Fitting XGBoost

models_XGBoost = []

for item in product_list:
    var_name_prediction_v2 = "prediction_" + str(item) + "_XGBoost"
    var_name_em_v2 = "evaluation_metrics_" + str(item) + "_XGBoost"
    globals()[var_name_prediction_v2], globals()[var_name_em_v2], XGBoost_model = XGBoost_train_test(filtered_df_non_promotion[filtered_df_non_promotion['item_id'] == item])
    models_XGBoost.append(XGBoost_model)
In [69]:
# Results of the Model Testing for each product

print("Evaluation Metrics - XGBoost")
print("--------------------------------")
print("--------------------------------")

for item in product_list:
    print(f"Evaluation Metrics - Item {item}")
    var_name = "evaluation_metrics_" + str(item) + "_XGBoost"
    print(globals()[var_name])
    print("--------------------------------")
Evaluation Metrics - XGBoost
--------------------------------
--------------------------------
Evaluation Metrics - Item 394846541
{'Mean Absolute Error': 176.32714080810547, 'Mean Squared Error': 32217.028427252546, 'Root Mean Squared Error': 179.49102603543315, 'MAPE': 35.275181799031216}
--------------------------------
Evaluation Metrics - Item 394848615
{'Mean Absolute Error': 101.67501831054688, 'Mean Squared Error': 10518.293723450042, 'Root Mean Squared Error': 102.55873304331544, 'MAPE': 35.233803513333314}
--------------------------------
Evaluation Metrics - Item 394851407
{'Mean Absolute Error': 105.12905883789062, 'Mean Squared Error': 11052.119012140669, 'Root Mean Squared Error': 105.12905883789062, 'MAPE': 41.389393243264024}
--------------------------------
Evaluation Metrics - Item 394857109
{'Mean Absolute Error': 104.09694353739421, 'Mean Squared Error': 11262.099516011105, 'Root Mean Squared Error': 106.12303951551287, 'MAPE': 35.63403966511623}
--------------------------------
Evaluation Metrics - Item 394858170
{'Mean Absolute Error': 48.73730278015137, 'Mean Squared Error': 2457.207043173461, 'Root Mean Squared Error': 49.57022335206349, 'MAPE': 34.59374107955503}
--------------------------------
Evaluation Metrics - Item 394860161
{'Mean Absolute Error': 114.44717407226562, 'Mean Squared Error': 13736.27967656497, 'Root Mean Squared Error': 117.2018757382533, 'MAPE': 37.68819372747828}
--------------------------------
Evaluation Metrics - Item 394865583
{'Mean Absolute Error': 222.68405151367188, 'Mean Squared Error': 49588.18679854367, 'Root Mean Squared Error': 222.68405151367188, 'MAPE': 44.09585178488552}
--------------------------------
Evaluation Metrics - Item 394866466
{'Mean Absolute Error': 47.14653015136719, 'Mean Squared Error': 2258.007874978939, 'Root Mean Squared Error': 47.51850034438102, 'MAPE': 33.87989767931945}
--------------------------------
Evaluation Metrics - Item 394873848
{'Mean Absolute Error': 27.935309886932373, 'Mean Squared Error': 808.8192574221666, 'Root Mean Squared Error': 28.43974784385696, 'MAPE': 36.08819686786704}
--------------------------------
Evaluation Metrics - Item 394882669
{'Mean Absolute Error': 260.6061553955078, 'Mean Squared Error': 67915.56823002757, 'Root Mean Squared Error': 260.6061553955078, 'MAPE': 51.401608559271764}
--------------------------------
Evaluation Metrics - Item 394885779
{'Mean Absolute Error': 203.31643676757812, 'Mean Squared Error': 41337.573459864594, 'Root Mean Squared Error': 203.31643676757812, 'MAPE': 46.63221026779315}
--------------------------------
Evaluation Metrics - Item 394885885
{'Mean Absolute Error': 44.29463195800781, 'Mean Squared Error': 2140.701920295367, 'Root Mean Squared Error': 46.26772006804925, 'MAPE': 33.01462521887584}
--------------------------------
Evaluation Metrics - Item 394890521
{'Mean Absolute Error': 180.68299865722656, 'Mean Squared Error': 32646.346003767336, 'Root Mean Squared Error': 180.68299865722656, 'MAPE': 43.748910086495535}
--------------------------------
Evaluation Metrics - Item 394893706
{'Mean Absolute Error': 26.995975971221924, 'Mean Squared Error': 780.8537779303388, 'Root Mean Squared Error': 27.943760983989588, 'MAPE': 36.07927820695879}
--------------------------------
Evaluation Metrics - Item 394904090
{'Mean Absolute Error': 140.84596252441406, 'Mean Squared Error': 19837.58515942865, 'Root Mean Squared Error': 140.84596252441406, 'MAPE': 46.94865417480469}
--------------------------------
Evaluation Metrics - Item 394907934
{'Mean Absolute Error': 24.0484561920166, 'Mean Squared Error': 587.2222941036562, 'Root Mean Squared Error': 24.232669974719173, 'MAPE': 34.19432387816137}
--------------------------------
Evaluation Metrics - Item 394909995
{'Mean Absolute Error': 97.43958282470703, 'Mean Squared Error': 9894.480781762395, 'Root Mean Squared Error': 99.47100472882737, 'MAPE': 34.53759297305289}
--------------------------------
Evaluation Metrics - Item 394914459
{'Mean Absolute Error': 60.32456970214844, 'Mean Squared Error': 3639.053709749365, 'Root Mean Squared Error': 60.32456970214844, 'MAPE': 42.78338276748116}
--------------------------------
Evaluation Metrics - Item 394915909
{'Mean Absolute Error': 143.43865966796875, 'Mean Squared Error': 21141.383462343365, 'Root Mean Squared Error': 145.40076843793972, 'MAPE': 34.565722408163154}
--------------------------------
Evaluation Metrics - Item 394917065
{'Mean Absolute Error': 167.91856486002604, 'Mean Squared Error': 29068.706646872994, 'Root Mean Squared Error': 170.49547397767776, 'MAPE': 34.22437427839635}
--------------------------------
Evaluation Metrics - Item 394917897
{'Mean Absolute Error': 23.410439014434814, 'Mean Squared Error': 589.4227926291587, 'Root Mean Squared Error': 24.278031069861466, 'MAPE': 33.213818265344955}
--------------------------------
Evaluation Metrics - Item 394924633
{'Mean Absolute Error': 253.10714721679688, 'Mean Squared Error': 64063.227972225286, 'Root Mean Squared Error': 253.10714721679688, 'MAPE': 48.581026337197095}
--------------------------------
Evaluation Metrics - Item 394930015
{'Mean Absolute Error': 26.505408763885498, 'Mean Squared Error': 748.100534246043, 'Root Mean Squared Error': 27.351426548647204, 'MAPE': 34.025365853594245}
--------------------------------
Evaluation Metrics - Item 394930651
{'Mean Absolute Error': 77.56673431396484, 'Mean Squared Error': 6016.598272133211, 'Root Mean Squared Error': 77.56673431396484, 'MAPE': 50.69721196991166}
--------------------------------
Evaluation Metrics - Item 394931359
{'Mean Absolute Error': 105.20796203613281, 'Mean Squared Error': 11250.324650796363, 'Root Mean Squared Error': 106.06754758547198, 'MAPE': 36.27464460853012}
--------------------------------
Evaluation Metrics - Item 394940184
{'Mean Absolute Error': 174.78903198242188, 'Mean Squared Error': 30969.705701352097, 'Root Mean Squared Error': 175.98211756127978, 'MAPE': 35.74288982980405}
--------------------------------
Evaluation Metrics - Item 394941031
{'Mean Absolute Error': 113.01422119140625, 'Mean Squared Error': 13235.948566500098, 'Root Mean Squared Error': 115.04759261496999, 'MAPE': 38.55770107921496}
--------------------------------
Evaluation Metrics - Item 394941377
{'Mean Absolute Error': 105.32774988810222, 'Mean Squared Error': 11604.884839610992, 'Root Mean Squared Error': 107.7259710543887, 'MAPE': 36.0233051624898}
--------------------------------
Evaluation Metrics - Item 394942170
{'Mean Absolute Error': 154.50106811523438, 'Mean Squared Error': 24229.189423748292, 'Root Mean Squared Error': 155.65728194899296, 'MAPE': 36.15418204074493}
--------------------------------
Evaluation Metrics - Item 394942631
{'Mean Absolute Error': 167.40731811523438, 'Mean Squared Error': 28526.81953353528, 'Root Mean Squared Error': 168.8988440858471, 'MAPE': 34.52077862810686}
--------------------------------
Evaluation Metrics - Item 394950597
{'Mean Absolute Error': 139.50213623046875, 'Mean Squared Error': 20361.096012864262, 'Root Mean Squared Error': 142.69231238179674, 'MAPE': 33.67999062739233}
--------------------------------
Evaluation Metrics - Item 394951176
{'Mean Absolute Error': 178.39956665039062, 'Mean Squared Error': 32063.139756047167, 'Root Mean Squared Error': 179.06183221459332, 'MAPE': 36.09886281001536}
--------------------------------
Evaluation Metrics - Item 395052168
{'Mean Absolute Error': 184.52191162109375, 'Mean Squared Error': 34562.08586830273, 'Root Mean Squared Error': 185.9088106258085, 'MAPE': 36.81419081158684}
--------------------------------
Evaluation Metrics - Item 395357341
{'Mean Absolute Error': 47.778075218200684, 'Mean Squared Error': 2334.528572999807, 'Root Mean Squared Error': 48.31695947594185, 'MAPE': 33.50065150102261}
--------------------------------
Evaluation Metrics - Item 395368886
{'Mean Absolute Error': 52.10391461407697, 'Mean Squared Error': 2820.055949688816, 'Root Mean Squared Error': 53.10419898359089, 'MAPE': 35.36623663371547}
--------------------------------
Evaluation Metrics - Item 395375136
{'Mean Absolute Error': 102.26250457763672, 'Mean Squared Error': 10945.352734416942, 'Root Mean Squared Error': 104.62003983184552, 'MAPE': 34.76825245376594}
--------------------------------
Evaluation Metrics - Item 395382145
{'Mean Absolute Error': 50.70055389404297, 'Mean Squared Error': 2757.2412993174657, 'Root Mean Squared Error': 52.509440097162205, 'MAPE': 34.940971712389526}
--------------------------------
Evaluation Metrics - Item 395384129
{'Mean Absolute Error': 63.05556392669678, 'Mean Squared Error': 4204.186458945078, 'Root Mean Squared Error': 64.83969817129841, 'MAPE': 38.57183394757225}
--------------------------------
Evaluation Metrics - Item 511584598
{'Mean Absolute Error': 100.18876647949219, 'Mean Squared Error': 10179.023303682217, 'Root Mean Squared Error': 100.89114581410114, 'MAPE': 35.33485494964235}
--------------------------------
Evaluation Metrics - Item 512317690
{'Mean Absolute Error': 51.25790051051548, 'Mean Squared Error': 2782.4022340607667, 'Root Mean Squared Error': 52.74848086969678, 'MAPE': 34.96200528732283}
--------------------------------
Evaluation Metrics - Item 512317697
{'Mean Absolute Error': 94.65811864341178, 'Mean Squared Error': 9155.147499090952, 'Root Mean Squared Error': 95.68253497421017, 'MAPE': 34.33791976980975}
--------------------------------
Evaluation Metrics - Item 512317702
{'Mean Absolute Error': 99.8319811139788, 'Mean Squared Error': 10330.7070778137, 'Root Mean Squared Error': 101.64008597897633, 'MAPE': 34.640660745425365}
--------------------------------
Evaluation Metrics - Item 512317726
{'Mean Absolute Error': 27.952558878305798, 'Mean Squared Error': 828.6632334090674, 'Root Mean Squared Error': 28.78651131014433, 'MAPE': 35.82995677203659}
--------------------------------
Evaluation Metrics - Item 512317737
{'Mean Absolute Error': 126.65196228027344, 'Mean Squared Error': 16852.969549443806, 'Root Mean Squared Error': 129.81898763063825, 'MAPE': 33.16927023078128}
--------------------------------
Evaluation Metrics - Item 512317760
{'Mean Absolute Error': 22.058460235595703, 'Mean Squared Error': 544.0494418426824, 'Root Mean Squared Error': 23.32486745605819, 'MAPE': 30.290590906490138}
--------------------------------
Evaluation Metrics - Item 512317763
{'Mean Absolute Error': 26.343447549002512, 'Mean Squared Error': 707.283602510924, 'Root Mean Squared Error': 26.594804050996952, 'MAPE': 35.25550199695525}
--------------------------------
Evaluation Metrics - Item 512319115
{'Mean Absolute Error': 24.39511095244309, 'Mean Squared Error': 634.255694882453, 'Root Mean Squared Error': 25.18443358272036, 'MAPE': 34.0766708355501}
--------------------------------
Evaluation Metrics - Item 512319119
{'Mean Absolute Error': 50.513123732346756, 'Mean Squared Error': 2764.87764255214, 'Root Mean Squared Error': 52.58210382394508, 'MAPE': 34.7056347812487}
--------------------------------
Evaluation Metrics - Item 512319130
{'Mean Absolute Error': 96.46749763488769, 'Mean Squared Error': 9546.921703144379, 'Root Mean Squared Error': 97.70835022220147, 'MAPE': 34.739541742221206}
--------------------------------
Evaluation Metrics - Item 512319152
{'Mean Absolute Error': 93.520690373012, 'Mean Squared Error': 8972.704253208716, 'Root Mean Squared Error': 94.72435934440895, 'MAPE': 34.18763615108665}
--------------------------------
Evaluation Metrics - Item 512319154
{'Mean Absolute Error': 172.9444134051983, 'Mean Squared Error': 30744.48114694705, 'Root Mean Squared Error': 175.34104239152637, 'MAPE': 34.76543126058006}
--------------------------------
Evaluation Metrics - Item 512319978
{'Mean Absolute Error': 98.60025376539964, 'Mean Squared Error': 9808.393402260603, 'Root Mean Squared Error': 99.03733337615974, 'MAPE': 34.20817048720958}
--------------------------------
Evaluation Metrics - Item 512319985
{'Mean Absolute Error': 95.86808242797852, 'Mean Squared Error': 9351.67885724539, 'Root Mean Squared Error': 96.70407880356127, 'MAPE': 34.38345082939451}
--------------------------------
Evaluation Metrics - Item 512320013
{'Mean Absolute Error': 52.460604759954634, 'Mean Squared Error': 2908.880763624487, 'Root Mean Squared Error': 53.934040861263924, 'MAPE': 35.100066739504804}
--------------------------------
Evaluation Metrics - Item 512320017
{'Mean Absolute Error': 143.38466842086225, 'Mean Squared Error': 21118.653964388082, 'Root Mean Squared Error': 145.32258587152955, 'MAPE': 34.86042696174399}
--------------------------------
Evaluation Metrics - Item 512464613
{'Mean Absolute Error': 50.64953585024233, 'Mean Squared Error': 2668.575278820024, 'Root Mean Squared Error': 51.658254701644964, 'MAPE': 34.75423596646599}
--------------------------------
Evaluation Metrics - Item 512464615
{'Mean Absolute Error': 99.05093048840034, 'Mean Squared Error': 9959.008515058738, 'Root Mean Squared Error': 99.79483210596999, 'MAPE': 35.09498913965339}
--------------------------------
Evaluation Metrics - Item 512464625
{'Mean Absolute Error': 99.26132906400241, 'Mean Squared Error': 9974.294282930867, 'Root Mean Squared Error': 99.87138871033518, 'MAPE': 34.689057724909425}
--------------------------------
Evaluation Metrics - Item 512464633
{'Mean Absolute Error': 47.57310812813895, 'Mean Squared Error': 2390.6169964881847, 'Root Mean Squared Error': 48.89393619344003, 'MAPE': 33.8218167620996}
--------------------------------
Evaluation Metrics - Item 512464642
{'Mean Absolute Error': 22.732900472787712, 'Mean Squared Error': 536.3661667155136, 'Root Mean Squared Error': 23.159580452061597, 'MAPE': 35.10702896558936}
--------------------------------
Evaluation Metrics - Item 512464646
{'Mean Absolute Error': 171.77658194082755, 'Mean Squared Error': 30194.96605310589, 'Root Mean Squared Error': 173.76698781156878, 'MAPE': 34.744291414305685}
--------------------------------
Evaluation Metrics - Item 512464651
{'Mean Absolute Error': 176.2540249294705, 'Mean Squared Error': 31570.869148242633, 'Root Mean Squared Error': 177.68193253182113, 'MAPE': 35.273797520653495}
--------------------------------
Evaluation Metrics - Item 512464658
{'Mean Absolute Error': 147.81852326569734, 'Mean Squared Error': 22457.876291007044, 'Root Mean Squared Error': 149.85952185632732, 'MAPE': 34.95660030976013}
--------------------------------
Evaluation Metrics - Item 514002189
{'Mean Absolute Error': 101.74978468153212, 'Mean Squared Error': 10563.774047482992, 'Root Mean Squared Error': 102.7802220637949, 'MAPE': 36.35645109933443}
--------------------------------
Evaluation Metrics - Item 515375115
{'Mean Absolute Error': 21.747573175737934, 'Mean Squared Error': 504.2388058038427, 'Root Mean Squared Error': 22.455262318749313, 'MAPE': 34.509276566882136}
--------------------------------
Evaluation Metrics - Item 515702203
{'Mean Absolute Error': 112.0943359375, 'Mean Squared Error': 13444.655149269103, 'Root Mean Squared Error': 115.95108946995325, 'MAPE': 37.34390278490967}
--------------------------------
Evaluation Metrics - Item 515775902
{'Mean Absolute Error': 131.41074250873766, 'Mean Squared Error': 17583.88532566529, 'Root Mean Squared Error': 132.60424324155426, 'MAPE': 32.14128988642172}
--------------------------------
Evaluation Metrics - Item 515775912
{'Mean Absolute Error': 139.52304967244467, 'Mean Squared Error': 19923.305057349342, 'Root Mean Squared Error': 141.14993821234688, 'MAPE': 33.5799139557453}
--------------------------------
Evaluation Metrics - Item 515775929
{'Mean Absolute Error': 49.39058335622152, 'Mean Squared Error': 2596.4150761852484, 'Root Mean Squared Error': 50.955029939989714, 'MAPE': 35.20774817838757}
--------------------------------
Evaluation Metrics - Item 515775953
{'Mean Absolute Error': 172.573361714681, 'Mean Squared Error': 30624.89150931399, 'Root Mean Squared Error': 174.99969002633688, 'MAPE': 34.99225728417554}
--------------------------------
Evaluation Metrics - Item 515775957
{'Mean Absolute Error': 49.91384410858154, 'Mean Squared Error': 2613.6157725299877, 'Root Mean Squared Error': 51.12353442916469, 'MAPE': 34.976173332160855}
--------------------------------
Evaluation Metrics - Item 516001717
{'Mean Absolute Error': 26.112015892477597, 'Mean Squared Error': 789.8533008846391, 'Root Mean Squared Error': 28.10432886380031, 'MAPE': 33.99475898534736}
--------------------------------
Evaluation Metrics - Item 516001998
{'Mean Absolute Error': 98.33987602820763, 'Mean Squared Error': 9920.581499773803, 'Root Mean Squared Error': 99.60211594024398, 'MAPE': 34.545392754452834}
--------------------------------
Evaluation Metrics - Item 516002000
{'Mean Absolute Error': 152.8120856651893, 'Mean Squared Error': 23825.170996692927, 'Root Mean Squared Error': 154.35404431595865, 'MAPE': 35.31236000207735}
--------------------------------
Evaluation Metrics - Item 516007566
{'Mean Absolute Error': 109.72551935369319, 'Mean Squared Error': 12612.817953648891, 'Root Mean Squared Error': 112.30680279328092, 'MAPE': 36.80310976446926}
--------------------------------
In [70]:
combined_df_XGBoost = pd.DataFrame()

for index, item in enumerate(product_list):
    
    subset_promoted = filtered_df_promotion[filtered_df_promotion["item_id"] == item]
    subset_promoted_v1 = subset_promoted.copy()
    subset_promoted_v1 = subset_promoted_v1.reset_index().rename(columns={'index': 'history_date'})
    subset_promoted_v1['history_date'] = pd.to_datetime(subset_promoted_v1['history_date'])
    subset_promoted_v1['history_date'] = subset_promoted_v1['history_date'].astype(int)    
    subset_promoted_v1['week'] = subset_promoted_v1['week'].astype(int)
    subset_promoted_v1['year'] = subset_promoted_v1['year'].astype(int)
    subset_promoted_v1['month'] = subset_promoted_v1['month'].astype(int)
    subset_promoted_v1['promo_type'] = subset_promoted_v1['promo_type'].astype(int)
    subset_promoted_v1['category_id'] = subset_promoted_v1['category_id'].astype(int)
    subset_promoted_v1['days_per_week_of_promotion'] = subset_promoted_v1['days_per_week_of_promotion'].astype(int)
    subset_promoted_v1['is_in_promotion'] = subset_promoted_v1['is_in_promotion'].astype(int)
    subset_promoted_v1 = subset_promoted_v1.drop('units_sold_counterfactual', axis=1)
    
    dtest = xgb.DMatrix(subset_promoted_v1)
    predictions = models_XGBoost[index].predict(dtest) 
    subset_promoted = subset_promoted.reset_index().rename(columns={'index': 'history_date'})
    subset_promoted["units_sold_counterfactual_XGBoost"] = predictions
    combined_df_XGBoost = pd.concat([combined_df_XGBoost, subset_promoted], axis=0)
In [71]:
merged_df_SARIMA_ARIMA = combined_df_SARIMA.merge(combined_df_ARIMA.drop(columns=['week', 'month', 'year', 'units_sold', 'units_sold_counterfactual', 'sales', 'inventory', 'promo_type', 'price', 'category_id', 'days_per_week_of_promotion','counterfactual_price','is_in_promotion']), on=['history_date', 'item_id'], how='inner')
In [72]:
merged_df_SARIMA_ARIMA_XGBoost = combined_df_XGBoost.merge(merged_df_SARIMA_ARIMA.drop(columns=['week', 'month', 'year', 'units_sold', 'units_sold_counterfactual', 'sales', 'inventory', 'promo_type', 'price', 'category_id', 'days_per_week_of_promotion','counterfactual_price','is_in_promotion']), on=['history_date', 'item_id'], how='inner')
In [73]:
merged_df_SARIMA_ARIMA_XGBoost.head()
Out[73]:
history_date item_id week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion units_sold_counterfactual_XGBoost units_sold_counterfactual_SARIMA units_sold_counterfactual_ARIMA
0 2015-11-16 394846541 47 11 2015 537.0 537.0 0.095500 -0.724154 0 -0.303045 3 1 -0.30601 1 321.975037 474.184725 494.745414
1 2015-11-23 394846541 48 11 2015 525.0 525.0 0.078261 -0.724154 0 -0.303045 3 2 -0.30601 1 321.975037 478.551325 494.745414
2 2015-11-30 394846541 49 11 2015 584.0 584.0 0.167508 -0.965977 0 -0.303045 3 2 -0.30601 1 321.975037 489.609441 494.745414
3 2015-12-07 394846541 50 12 2015 573.0 573.0 0.150345 -1.036640 0 -0.303045 3 2 -0.30601 1 321.975037 474.025938 494.745414
4 2015-12-14 394846541 51 12 2015 602.0 602.0 0.194582 -1.048417 0 -0.303045 3 1 -0.30601 1 321.975037 483.564626 494.745414
In [74]:
filtered_df_non_promotion["units_sold_counterfactual_SARIMA"] = filtered_df_non_promotion["units_sold_counterfactual"]
filtered_df_non_promotion["units_sold_counterfactual_ARIMA"] = filtered_df_non_promotion["units_sold_counterfactual"]  
filtered_df_non_promotion["units_sold_counterfactual_XGBoost"] = filtered_df_non_promotion["units_sold_counterfactual"]  
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/1989425113.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_df_non_promotion["units_sold_counterfactual_SARIMA"] = filtered_df_non_promotion["units_sold_counterfactual"]
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/1989425113.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_df_non_promotion["units_sold_counterfactual_ARIMA"] = filtered_df_non_promotion["units_sold_counterfactual"]
/var/folders/mn/sbl7d2753k5cw1h2tqhxptdc0000gn/T/ipykernel_21158/1989425113.py:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_df_non_promotion["units_sold_counterfactual_XGBoost"] = filtered_df_non_promotion["units_sold_counterfactual"]
In [75]:
merged_df_SARIMA_ARIMA_XGBoost.set_index('history_date', inplace=True)
In [76]:
final_dataset = pd.concat([filtered_df_non_promotion, merged_df_SARIMA_ARIMA_XGBoost])
In [77]:
final_dataset = final_dataset.reset_index().rename(columns={'index': 'history_date'})
In [78]:
final_dataset.head(10)
Out[78]:
history_date item_id week month year units_sold units_sold_counterfactual sales inventory promo_type price category_id days_per_week_of_promotion counterfactual_price is_in_promotion units_sold_counterfactual_SARIMA units_sold_counterfactual_ARIMA units_sold_counterfactual_XGBoost
0 2014-01-06 394846541 2 1 2014 418.0 418.0 -0.085062 -0.539646 0 -0.303045 3 0 -0.30601 0 418.0 418.0 418.0
1 2014-01-13 394846541 3 1 2014 515.0 515.0 0.060938 -0.751634 0 -0.303045 3 0 -0.30601 0 515.0 515.0 515.0
2 2014-01-20 394846541 4 1 2014 528.0 528.0 0.078816 -0.845066 0 -0.303045 3 0 -0.30601 0 528.0 528.0 528.0
3 2014-01-27 394846541 5 1 2014 491.0 491.0 0.025183 -0.845066 0 -0.303045 3 0 -0.30601 0 491.0 491.0 491.0
4 2014-02-03 394846541 6 2 2014 512.0 512.0 0.054979 -0.845066 0 -0.303045 3 0 -0.30601 0 512.0 512.0 512.0
5 2014-02-10 394846541 7 2 2014 515.0 515.0 0.059448 -0.965977 0 -0.303045 3 0 -0.30601 0 515.0 515.0 515.0
6 2014-02-17 394846541 8 2 2014 503.0 503.0 0.041571 -0.965977 0 -0.303045 3 0 -0.30601 0 503.0 503.0 503.0
7 2014-02-24 394846541 9 2 2014 490.0 490.0 0.023693 -0.965977 0 -0.303045 3 0 -0.30601 0 490.0 490.0 490.0
8 2014-03-03 394846541 10 3 2014 488.0 488.0 0.019224 -1.133212 0 -0.303045 3 0 -0.30601 0 488.0 488.0 488.0
9 2014-03-10 394846541 11 3 2014 501.0 501.0 0.038591 -0.520803 0 -0.303045 3 0 -0.30601 0 501.0 501.0 501.0
In [79]:
mse_SARIMA = np.mean((final_dataset["units_sold"] - final_dataset["units_sold_counterfactual_SARIMA"]) ** 2)  # Mean Squared Error
mse_ARIMA = np.mean((final_dataset["units_sold"] - final_dataset["units_sold_counterfactual_ARIMA"]) ** 2)  # Mean Squared Error
mse_XGBoost = np.mean((final_dataset["units_sold"] - final_dataset["units_sold_counterfactual_XGBoost"]) ** 2)  # Mean Squared Error
In [80]:
print("Mean Squared Error - SARIMA", mse_SARIMA)
print("Mean Squared Error - ARIMA", mse_ARIMA)
print("Mean Squared Error - XGBoost", mse_XGBoost)
Mean Squared Error - SARIMA 5716.071005512629
Mean Squared Error - ARIMA 5517.376491503199
Mean Squared Error - XGBoost 8798.897759200265

SARIMA model outperformed ARIMA and XGBoost.

Data Visualisation¶

In [85]:
category = 6
df = final_dataset[final_dataset['category_id']==category]
df = df[df['is_in_promotion']== True]

plt.figure(figsize=(10, 8))

# Plotting the first seaborn plot on the first subplot
sns.barplot(data=df, x='item_id', y='units_sold',color='red', label='units sold on promotion')

# Plotting the second seaborn plot on the second subplot
sns.barplot(data=df, x='item_id', y='units_sold_counterfactual_SARIMA',color='blue',alpha=0.7, label='units sold (counterfactual model)')

plt.ylabel("units sold")
plt.tick_params(axis='x',rotation=45)
plt.legend()
plt.title(f"Units Sold With or Without Promotion of Category {category} Products")
plt.show()

The plot above represents units sold during promotion and out of promotion periods for products belonging from category 6. The CI is widest for units sold in promotion period in comparison to sold units in non promotion period.

Plots of lift ration (units sold on promo / baseline)¶

In [86]:
only_promotion_period = final_dataset[final_dataset["is_in_promotion"] == True]
only_promotion_period=only_promotion_period.groupby("item_id").agg({
    "units_sold":'sum',
    'units_sold_counterfactual_SARIMA':'sum',
    'category_id':'last',
    'promo_type':'last'}).reset_index()

only_promotion_period['lift'] = only_promotion_period['units_sold'] / only_promotion_period["units_sold_counterfactual_SARIMA"]
only_promotion_period.head()
Out[86]:
item_id units_sold units_sold_counterfactual_SARIMA category_id promo_type lift
0 394846541 2821.0 2399.936055 3 0 1.175448
1 394848615 2319.0 1709.159512 2 1 1.356807
2 394851407 2486.0 1025.687471 2 1 2.423740
3 394857109 1999.0 1507.328262 1 1 1.326188
4 394858170 819.0 712.117491 6 0 1.150091
In [89]:
bar_width = 0.35

# Plot the stacked bar chart
plt.figure(figsize=(8, 5))
plt.bar(only_promotion_period['category_id'], only_promotion_period['units_sold_counterfactual_SARIMA'], width=bar_width, label='Units sold (counterfactual model)')
plt.bar(only_promotion_period['category_id'], only_promotion_period['units_sold'], width=bar_width, label='Units Sold',bottom=only_promotion_period['units_sold_counterfactual_SARIMA'])
plt.xlabel('Category')
plt.ylabel('Units')
plt.title('Units Sold vs Units Predicted by Category')
plt.legend()
plt.show()

Actual units sold are higher than the units sold predicted by the counterfactual model, which indicates the effectiveness of running the promotion campaigns on the sales.

In [92]:
# Setting the plot style
sns.set(style="whitegrid")

# Plotting the chart
plt.figure(figsize=(12, 6))
graph = sns.barplot(
    x = 'category_id',
    y ='lift',
    data=only_promotion_period,
    ci = 'sd',  # Display confidence intervals based on the standard deviation
    #hue="is_in_promotion",
    capsize=0.1,  # Length of the caps on error bars
    errwidth=1.5,
)

graph.axhline(y=1, color='red', linestyle='--',linewidth=2, alpha=0.4)

for container in graph.containers:
    plt.bar_label(container, label_type='edge', color='black',weight = 'bold')
    plt.xticks(rotation=45)
    plt.xlabel('Product Category')
    plt.ylabel('Lift Ratio')
    plt.title('Lift Ratio with Confidence Intervals for Each Product Category')
    plt.show()

Lift ratio is the highest for category 4 whereas it is the least for category 5. The confidence interval of category 4 is the widest while category 5 has the narrowest confidence interval.

In [93]:
# Setting the plot style
sns.set(style="whitegrid")

# Plotting the chart
plt.figure(figsize=(12, 6))
graph = sns.barplot(
    x='promo_type',
    y='lift',
    data=only_promotion_period,
    ci ='sd',  # Display confidence intervals based on the standard deviation
    #hue="is_in_promotion",
    capsize=0.1,
    errwidth=1.5,
)
plt.xticks(rotation=45)
plt.xlabel('Promotion type')
plt.ylabel('Lift Ratio')
plt.title('Lift Ratio with Confidence Intervals for Each Promo type')

graph.axhline(y=1, color='red', linestyle='--',linewidth=2, alpha=0.4)

for container in graph.containers:
    plt.bar_label(container, label_type='edge', color='black',weight = 'bold')

plt.show()

Promo Type 2 proves the most effective since it has incurs the highest lift ratio and also has the widest confidence interval whereas promo type 0 is the least effect one because it has the least lift ratio and its confidence interval is also the narrowest.

In [94]:
# Setting the plot style
sns.set(style="whitegrid")

# Plotting the chart
plt.figure(figsize=(12, 6))
graph = sns.barplot(
    x = 'category_id',
    y = 'lift',
    data = only_promotion_period,
    ci = 'sd',  # Display confidence intervals based on the standard deviation
    hue="promo_type",
    capsize=0.1,  # Length of the caps on error bars
)

graph.axhline(y=1, color='red', linestyle='--',linewidth=2, alpha=0.4)
plt.xticks(rotation=45)
plt.xlabel('Product Category')
plt.ylabel('Lift Ratio')
plt.title('Lift Ratio with Confidence Intervals for Each Product Category (by promotion)')
plt.show()

The promotion proves most effective with promo type 2 in general but specifically speaking category 4 out performs all categories with the best lift ratio given that the promo type is 2.

  • Promo type 0 positively affects categories 5 and 6 the most and it least affects category 4.
  • Promo type 1 positively affects categories 2 and 6 the most and it least affects category 1.
  • Promo type 2 positively affects categories 4 the most and it least affects category 6.

A summary of the work¶

The rationale behind the choice of model(s) for this problem.¶

The SARIMA (Seasonal Autoregressive Integrated Moving Average) model has several advantages that make it a popular choice for time series forecasting:

  • Seasonality Handling: SARIMA models are specifically designed to handle seasonal patterns in time series data. They can capture and model the seasonal component of the data, allowing for accurate forecasting of future seasonal fluctuations.

  • Flexibility: SARIMA models offer flexibility in capturing both the non-seasonal and seasonal components of a time series. They can handle various combinations of autoregressive (AR), moving average (MA), and differencing (I) terms to capture different patterns and trends in the data.

  • Accuracy: SARIMA models can provide accurate forecasts, especially when applied to time series data with clear patterns and seasonality. By incorporating both the autoregressive and moving average components, SARIMA models can capture the dependencies and fluctuations in the data, resulting in more accurate predictions.

  • Forecasting Horizon: SARIMA models can effectively forecast multiple time steps ahead, making them suitable for both short-term and long-term forecasting tasks. By considering the historical patterns and trends in the data, SARIMA models can project future values over extended time periods.

  • Interpretability: SARIMA models provide interpretable parameters that can help understand the underlying patterns and dynamics in the time series. The coefficients of the autoregressive and moving average terms reflect the impact of past values and the moving average shocks on the current observation.

  • Statistical Framework: SARIMA models are based on a solid statistical framework, making them well-suited for analyzing and modeling time series data. They leverage concepts such as stationarity, autocorrelation, and partial autocorrelation, which are fundamental in understanding and modeling time-dependent data.

  • Availability: SARIMA models are widely available in statistical software packages and libraries, such as statsmodels in Python and the forecast package in R. These tools provide efficient implementations of SARIMA algorithms, making it easier to build and evaluate SARIMA models.

ARIMA (Autoregressive Integrated Moving Average) models offer several advantages for time series analysis:

  • Simple and interpretable: ARIMA models have a straightforward structure and are relatively easy to understand and interpret. They are based on the autoregressive (AR), differencing (I), and moving average (MA) components, which can be easily explained and related to the underlying data patterns.

  • Flexibility in capturing different trends: ARIMA models can capture various types of trends and patterns in time series data, including linear, non-linear, and seasonal trends. By adjusting the orders of the AR, I, and MA components, ARIMA models can be tailored to capture specific characteristics of the data.

  • Effective in handling stationarity: ARIMA models are designed to handle non-stationary time series data by differencing the series to make it stationary. This differencing step helps remove trends and seasonality, making the data suitable for modeling.

  • Good performance with moderate-sized datasets: ARIMA models perform well when working with moderate-sized datasets. They can handle time series data with hundreds or thousands of observations and provide reliable forecasts.

  • Established and widely used: ARIMA models have been around for a long time and are widely used in various fields, including economics, finance, and social sciences. They have a solid theoretical foundation and are supported by extensive research and literature.

  • Model diagnostics and interpretation: ARIMA models provide diagnostic tools such as residual analysis, ACF, PACF plots, and statistical tests to assess model fit and identify potential issues. These diagnostics help in understanding the model's performance and guide model improvements.

  • Forecasting capabilities: ARIMA models are primarily used for time series forecasting. They can provide accurate predictions for future values based on historical data and identified patterns. The models can be updated and retrained as new data becomes available, allowing for dynamic forecasting.

XGBoost, an optimized gradient boosting framework, offers several advantages for time series analysis:

  • Handles non-linear relationships: XGBoost is capable of capturing complex non-linear relationships between variables, making it suitable for modeling time series data with intricate patterns and trends.

  • Feature importance: XGBoost provides a feature importance metric, which helps identify the most influential variables in the model. This information can be valuable for understanding the underlying drivers of the time series and making informed decisions.

  • Handles missing values: XGBoost can handle missing values in the dataset by using a default direction for missing values during the tree construction process. This allows for efficient handling of missing data without requiring imputation or data preprocessing.

  • Automatic handling of outliers: XGBoost has the ability to automatically handle outliers by assigning them to the appropriate leaf nodes during the tree construction process. This helps prevent outliers from having a disproportionate impact on the model.

  • Regularization: XGBoost includes regularization techniques such as L1 and L2 regularization, which help prevent overfitting and improve the generalization performance of the model. Regularization is especially useful for time series analysis, as it helps prevent the model from fitting too closely to the training data and better captures the underlying patterns.

  • Parallel processing: XGBoost has built-in support for parallel processing, allowing for faster training and prediction on large datasets. This is particularly advantageous when working with time series data that may have a high number of observations or features.

  • Flexibility in objective functions: XGBoost offers flexibility in choosing the objective function to optimize during model training. This enables customization of the model to specific time series forecasting tasks, such as optimizing for mean squared error (MSE), mean absolute error (MAE), or other evaluation metrics based on the specific requirements.

  • Ensemble of weak models: XGBoost utilizes an ensemble approach by combining multiple weak models (decision trees) to create a stronger model. This ensemble strategy helps improve the predictive accuracy and robustness of the time series model.

  • It's important to note that XGBoost may require careful feature engineering, including lagged variables and other time-related features, to effectively capture temporal dependencies in the time series data. Additionally, tuning the hyperparameters of XGBoost is crucial for obtaining optimal model performance.

The Prophet model offers several advantages for time series forecasting:

  • Flexibility: Prophet is designed to handle a wide range of time series data, including those with irregularities, missing values, and outliers. It can effectively handle data with various patterns, trends, and seasonality.

  • Automatic Seasonality Detection: Prophet can automatically detect and model multiple seasonalities in the data, including daily, weekly, monthly, and yearly patterns. This makes it suitable for analyzing and forecasting time series with complex seasonal variations.

  • Incorporation of Holiday Effects: Prophet allows the inclusion of custom holiday effects, enabling the model to capture the impact of holidays or specific events on the time series. This is particularly useful for industries where holidays or special events have a significant influence on the data.

  • Intuitive and Interpretable: The model provides easily interpretable results, making it accessible to users with varying levels of technical expertise. It generates visualizations that help understand the underlying patterns in the data, including trend, seasonality, and holiday effects.

  • Robustness to Outliers: Prophet employs a robust fitting procedure that can handle outliers and abrupt changes in the time series. It minimizes the impact of extreme values on the overall model performance, making it more reliable in the presence of anomalous data points.

  • Scalability: Prophet is designed to handle large-scale time series datasets efficiently. It can handle thousands of time series with high frequency, making it suitable for applications where forecasting needs to be performed on a large number of individual series.

  • Open-source and Well-documented: Prophet is an open-source library developed by Facebook's Core Data Science team. It has gained popularity in the data science community and is actively maintained. The availability of extensive documentation, tutorials, and examples makes it easy to understand and implement.

A brief summary of the most important features for promotion effectiveness, and an explanation of how you arrived at that conclusion.¶

The most important features according to the correlation analysis were days_per_week_promotion, sales, and is_in_promotion. These variables were most correlated with the target variable hence, they hold great importance in determining the effectiveness of the promotion campaign.

Any additional thoughts¶

What other types of data could be requested from the client? (1) to improve the accuracy of the models, (2) to answer other business questions.¶

To improve the accuracy of the models and answer other business questions related to the impact of promotion on sales, you could request the following types of data from the client:

  • Promotion Data:

Detailed information about the promotions, such as promotion type, duration, discount amount, coupon codes, etc. Promotion start and end dates, including any overlapping or concurrent promotions. Historical data on past promotions, including sales data during promotional periods.

  • Customer Data:

Customer demographics, such as age, gender, location, etc. Customer behavior data, including purchase history, frequency of purchases, average order value, etc. Customer segmentation data, if available.

  • Product Data:

Detailed information about the products, including product attributes, categories, variants, etc. Historical sales data for individual products or product categories.

  • External Factors:

Economic indicators, such as GDP, inflation rate, consumer confidence index, etc. Weather data, if applicable, as weather conditions can impact sales. Industry-specific factors or market trends that may influence sales.

  • Competitor Data:

Information about competitor promotions, pricing strategies, and market share. Competitive analysis data, such as competitor sales performance and marketing activities.

  • Seasonality and Events:

Calendar of holidays, festivals, and special events that could affect sales. Information about seasonal trends, peak shopping periods, or recurring patterns.

  • Inventory and Supply Chain Data:

Inventory levels and stock availability. Delivery and lead times for products.

  • Customer Feedback and Surveys:

Customer feedback and reviews that provide insights into their preferences and satisfaction levels. Results from customer surveys or market research studies.

What could be the reason(s) that we forecast sales in units instead of in dollars? Can you think of¶

There are several reasons why forecasting sales in units rather than in dollars might be preferred:

  • Product Pricing Variability: If the pricing of the products is subject to frequent changes or variability, forecasting in units allows for a more stable and accurate prediction of sales volume. Unit sales are less affected by pricing fluctuations, making it easier to capture underlying demand patterns.

  • Product Mix and Assortment: Forecasting in units allows for a better understanding of the demand for different product categories or variants. It helps identify which specific products are driving sales and allows for effective inventory management and production planning.

  • Seasonality and Promotions: Unit sales capture the seasonal demand patterns and the impact of promotional activities more effectively. Seasonal fluctuations and promotional effects are often driven by changes in the number of units sold rather than the dollar value.

  • Consistency Across Channels: When forecasting sales across multiple channels or locations, units provide a consistent measure that can be easily compared and aggregated. Different pricing structures or currency exchange rates can introduce complexities when forecasting in dollars.

  • Benchmarking and Comparisons: Forecasting in units allows for benchmarking and comparisons with industry standards or competitors. Unit sales provide a more standardized metric for evaluating performance and market share.

  • Operational Planning: Forecasting in units facilitates operational planning, such as production scheduling, supply chain management, and capacity utilization. It provides insights into the volume of products that need to be produced, stocked, or distributed.

It's important to note that forecasting in dollars can still be valuable, especially when considering revenue, profitability, and financial planning. However, forecasting in units offers specific advantages in capturing demand patterns, understanding product performance, and facilitating operational decision-making. The choice between forecasting in units or dollars depends on the specific objectives, nature of the business, and available data.

Business application for which forecasting in dollars makes more sense?¶

Forecasting in dollars can be particularly useful in various business applications where the financial perspective is crucial. Here are some examples:

  • Revenue Forecasting: For businesses focused on revenue generation, forecasting in dollars provides a direct estimation of the expected monetary inflow. It helps in budgeting, financial planning, and assessing the overall financial health of the organization.

  • Profitability Analysis: Forecasting sales in dollars allows for a more accurate assessment of profitability. By incorporating cost data, such as production costs, operational expenses, and pricing strategies, businesses can forecast the expected profit margins and make informed decisions to optimize profitability.

  • Financial Planning and Budgeting: Dollar-based forecasts are essential for financial planning, including setting revenue targets, allocating resources, and creating budgets. It enables businesses to align their financial goals and allocate funds effectively to different areas of the organization.

  • Pricing and Product Strategy: Forecasting sales in dollars helps in evaluating the impact of pricing strategies on revenue. By analyzing different pricing scenarios, businesses can forecast the potential revenue changes and optimize pricing decisions to maximize revenue and market competitiveness.

  • Investor Relations: For businesses that are publicly traded or seeking external funding, forecasting in dollars is essential for communicating financial performance to investors and stakeholders. It provides a clear understanding of the expected revenue and financial outcomes, aiding investor relations and decision-making.

  • Financial Reporting and Compliance: Many financial reporting standards and compliance requirements, such as Generally Accepted Accounting Principles (GAAP) or International Financial Reporting Standards (IFRS), require reporting financial results in dollar values. Forecasting in dollars ensures compliance with these standards and facilitates accurate financial reporting.

While forecasting in dollars is valuable for financial-oriented analyses and decision-making, it's important to note that forecasting in units can still be relevant for operational planning, demand forecasting, and supply chain management. The choice between forecasting in dollars or units depends on the specific business context and the objectives of the analysis.

What is the underlying assumption when we fill missing values with zeros? What are the alternative approaches to deal with missing values?¶

When we fill missing values with zeros, the underlying assumption is that the missing values represent an absence or lack of the variable being measured. In other words, the missing values are assumed to have a value of zero.

However, filling missing values with zeros may not always be appropriate and can introduce bias or distort the data. The choice of how to handle missing values depends on the nature of the data and the analysis being performed.

Here are some alternative approaches to deal with missing values:

  • Mean/Median/Mode Imputation: Instead of filling missing values with zeros, we can impute them with the mean, median, or mode of the available data. This approach assumes that the missing values are similar to the observed values and uses summary statistics to estimate the missing values.

  • Forward or Backward Fill: This approach involves propagating the last observed value forward or the next observed value backward to fill in the missing values. It assumes that the variable remains constant between successive observations.

  • Interpolation: Interpolation methods estimate missing values based on the trend or pattern of the data. Common interpolation techniques include linear interpolation, spline interpolation, or time-based interpolation.

  • Multiple Imputation: Multiple imputation involves generating multiple plausible values for missing data based on statistical models. It takes into account the uncertainty associated with missing values and provides a more robust estimate by considering the variability in the imputed values.

  • Model-based Imputation: This approach involves using predictive models to estimate missing values. The models are trained on the available data to predict the missing values based on other variables.

  • Dropping Missing Values: In some cases, if the proportion of missing values is small and randomly distributed, it may be reasonable to exclude the observations with missing values from the analysis. However, this approach should be used with caution as it may introduce selection bias.

The choice of the approach to handle missing values depends on factors such as the amount of missing data, the nature of the variables, the underlying data distribution, and the analysis objectives. It is important to carefully consider the implications of each approach and select the most suitable method based on the specific context and the potential impact on the analysis results.

How would you communicate the results to other data scientists / to the client?¶

When communicating the results to other data scientists or to the client, it's important to present the findings in a clear and understandable manner. Here are some key points to consider:

  • Executive Summary: Start by providing a brief overview of the project objectives, methodology used, and the key findings. This allows the reader to quickly grasp the main insights without going into too much detail.

  • Visualizations: Utilize visualizations such as charts, graphs, and tables to present the results in a visually appealing and intuitive manner. Visuals can effectively communicate patterns, trends, and relationships in the data.

  • Interpretation: Provide a clear and concise interpretation of the results. Explain the implications of the findings and their relevance to the problem or question at hand. Avoid jargon and technical terms, and instead focus on explaining the insights in a language that the intended audience can easily understand.

  • Contextualize the Results: Provide context for the results by considering external factors, industry benchmarks, or previous research. This helps to give a broader perspective and allows the audience to understand the significance of the findings within a larger context.

  • Limitations and Assumptions: Discuss the limitations and assumptions made during the analysis. Acknowledge any potential sources of bias, uncertainty, or data quality issues. This demonstrates transparency and helps the audience to properly interpret and evaluate the results.

  • Recommendations: Based on the findings, provide actionable recommendations or next steps. These recommendations should be aligned with the project objectives and should address the specific problem or question that was being investigated.

  • Clear Documentation: Ensure that all the data, code, and methodologies used in the analysis are well-documented and easily accessible. This allows other data scientists or the client to review and replicate the analysis if needed.

  • Tailor the Communication: Consider the technical proficiency and background of the audience when communicating the results. Adjust the level of technical detail and provide additional explanations or context as necessary.

  • Engage in Discussion: Encourage open discussion and feedback from other data scientists or the client. This can help to validate the findings, address any concerns or questions, and foster collaboration and further exploration of the results.

Overall, the key is to present the results in a concise, informative, and engaging manner that effectively conveys the insights and implications of the analysis to the intended audience.